GLOBAL DIFFERENTIAL GEOMETRY - AN INTRODUCTION FOR CONTROL ENGINEERS


B. F. Doolin* and C. F. Martin** 
Ames Research Center 


This publication has been written to acquaint engineers, especially control 
engineers, with the basic concepts and terminology of modern global differential 
geometry. The ideas discussed are applied here mainly as an introduction to the Lie 
theory of differential equations and to the role of Grassmannians in control systems 
analysis. To reach these topics, the fundamental notions of manifolds, tangent
spaces, vector fields, and Lie algebras are discussed and exemplified. An appendix 
reviews such concepts needed for vector calculus as open and closed sets, compactness, 
continuity, and derivative. 

Although the content is mathematical, this is not a mathematical treatise. 

Several excellent introductions to modern differential geometry exist, but they are 
written for readers with a strong mathematical, rather than engineering, background. 
Reading this publication should help an engineer to read those treatises, as well as 
to understand the points, if not the detailed arguments, of research papers on 
geometric control, and many of those on nonlinear control. 


INTRODUCTION 


This report presents to a control engineer some basic concepts and facts of global
differential geometry, and some of its uses. It is not a mathematical treatise; the
subject matter is well developed in many excellent books, for example, in refer- 
ences 1, 2, and 3, which, however, are intended for the reader with an extensive 
mathematical background. Here, only some basic ideas and a minimum of theorems and 
proofs are presented. Indeed, a proof occurs only if its presence strongly aids 
understanding. Even among basic ideas of the subject, many directions and results 
have been neglected. Only those needed for viewing control systems from the stand- 
point of vector fields are discussed. 

Differential geometry treats of curves and surfaces, the functions that define 
them, and transformations between the coordinates that can be used to specify them. 

It also treats the differential relations that stitch pieces of curves or surfaces 
together or that tell one where to go next. 

In thinking of functions that can define surfaces in space, one is likely to 
think of real functions (functions assigning a real number to a given point of their 
argument) of three-space variables such as the kinetic energy of a particle, or the 
distribution of temperature in a room. Differential geometry examines properties 
inherent in the surfaces these functions define that, of course, are due to the 
sources of energy or temperature in the surroundings. Or, given enough of these 
functions, one might use them as proper coordinates of a problem. Then the generali- 
ties of differential geometry show how to operate with them when they are used, for 
example, to describe a dynamic evolution. 


*Currently with Computer Sciences Corporation, Mountain View, California. 

**Senior National Research Council Associate. Currently with Case Western 
Reserve University, Cleveland, Ohio. 


Differential geometry, in sum, derives general properties from the study of func- 
tions and mappings so that methods of characterization or operation can be carried 
over from one situation to another. Global differential geometry refers to the 
description of properties and operations that are good over "large" portions of space.

Though the studies of differential geometry began in geodesy and dynamics where 
intuition can be a faithful guide, the spaces now in this geometry’s concern are far 
more general. Instead of considering a set of three or six real functions on a space 
of vectors of three or six dimensions, spaces can be described by longer ordered 
strings of numbers, by sets of numbers ordered in various ways, by ordered sets of 
products of numbers. Examples are n-dimensional vector spaces, matrices, or multi- 
linear objects like tensors. It is not just these sets of numbers, but also the rules 
one has of passing from one set to another that form the proper subject matter of 
differential geometry, and which link it to matters of interest in control. 

All analytic considerations of geometry begin with a space filled with stacks of 
numbers. Before one can proceed to discuss the relations that associate one point 
with another or dictate what point follows another, one has to establish certain 
ground rules. The ground rules that say if one point can be distinguished from 
another, or that there is a point close enough to wherever you want to go, are 
referred to as topological considerations. The basic description of the topological 
spaces underlying all the geometry of this paper is given in an appendix on fundamen- 
tals of vector calculus. This appendix discusses such desired topological character- 
istics as compactness and continuity, which is needed to preserve these characteris- 
tics in passing from one space to another. The appendix concludes by recalling two 
theorems from vector calculus that provide the basic glue by which manifolds, the word 
for the fundamental spaces of global differential geometry, are assembled. Since this 
discussion is fundamental to differential geometry, we briefly review it. The review 
is relegated to an appendix, however, because it is not the topic of this paper, nor 
should one dwell on it. 

The first two sections of the body of the paper describe manifolds, the spaces of 
our geometry. Some simple manifolds are mentioned. Several definitions are given, 
starting with one closest to intuition then passing to one perhaps more abstract, but 
actually less demanding to verify in cases of interest in control engineering. Then 
mappings between manifolds are considered. A special space, the tangent space, is 
discussed in section 3. A tangent space is attached to every point in the manifold. 
Since this is where the calculus is done, it and its relations to neighboring tangent 
spaces and to the manifold that supports it must be carefully described. 

Computation in these spaces is the topic of the next two sections. Calculus on 
manifolds is given in section 4 on vector fields and their algebra, where the connec- 
tion between global differential geometry and linear and nonlinear control begins to 
become clear. Section 5, with its treatment of some algebraic rules, concludes our 
exposition of the fundamentals of the geometry. 

The examples given as the development unfolds should not only help the reader 
understand the topic under discussion, but should also provide a basic set for testing 
ideas presented in the current literature. More comprehensive applications of differ- 
ential geometry to control are given in the final major section of the paper. 

MANIFOLDS AND THEIR MAPS


The first part of this section is devoted to the concept of a manifold. It is 
defined first by a projection then by a more useful though less intuitive definition. 
Finally, it is seen how implicitly defined functions give manifolds. Examples are 
considered both to enhance intuition and to bring out conceptual details. The idea of 
a manifold is brought out more clearly by considering mappings between manifolds. The 
properties of these mappings occupy the last part of this section. 


Differentiable Manifolds 

Although the detailed global description of a manifold can be quite complicated,
basically a differentiable manifold is just a topological space $(X,\mathcal{O})$ that in the
neighborhood of each point looks like an open subset of $\mathbb{R}^k$. (In the notation
$(X,\mathcal{O})$, X is some set and $\mathcal{O}$ consists of all the sets defined as open in X and
that characterize its topology. As to the notation $\mathbb{R}^k$, each point in $\mathbb{R}^k$ is
specified as an ordered set of k real numbers. These and other notions arising below
are discussed in the appendix.) This description can be formalized into a definition:

A subset M of $\mathbb{R}^n$ is a k-dimensional manifold if for each $x \in M$ there
are: open subsets U and V of $\mathbb{R}^n$ with $x \in U$, and a diffeomorphism f
from U to V such that:

$$f(U \cap M) = \{y \in V : y^{k+1} = \dots = y^n = 0\}$$

Thus, a point y in the image of f has a representation like:

$$y = (y^1(x), y^2(x), \dots, y^k(x), 0, \dots, 0)$$

A straight line is a simple example of a one-dimensional manifold, a manifold in
$\mathbb{R}^1$. It is a manifold in $\mathbb{R}^1$ even if it is given, for example, in $\mathbb{R}^2$. There it might
represent the surface of solutions of the equation of a particle of unit mass under no
forces: $\ddot{x} = 0$ and with given initial momentum: $\dot{x}(t = 0) = a$. In the coordinate
system $y^1 = x$; $y^2 = \dot{x} - a$, the manifold is given by the points $(y^1, 0)$. To the
particle, its whole world looks like part of $\mathbb{R}^1$ though we see its tracks clearly as
part of $\mathbb{R}^2$. Any open subset of the straight line is also a one-dimensional manifold,
but a closed interval of it is not, since no neighborhood of an endpoint looks like an
open subset of $\mathbb{R}^1$.

The sphere in $\mathbb{R}^3$ is an example of a two-dimensional manifold. It is an example
of a closed manifold and is often denoted as $S^2$. Thus, for a point P in $\mathbb{R}^3$:
$P = (x_1, x_2, x_3)$, the manifold is given as the set:

$$S^2 = \{P \in \mathbb{R}^3 : x_1^2 + x_2^2 + x_3^2 - 1 = 0\}$$

Its two-dimensional character is clear when a point in $S^2$ is given in terms of two
variables, say, latitude and longitude. Another map of $S^2$ into $\mathbb{R}^2$ is given by
stereographic projections. Since this map not only has historical interest but also
will be used later, it will now be discussed to show that $S^2$ is a two-dimensional
manifold.

Let U(r;P) be an r-neighborhood of a given point P in $M = S^2$ such
that $U \cap M$ is the set

$$U \cap M = \{P : x_1^2 + x_2^2 + x_3^2 - 1 = 0;\ x_1^2 + x_2^2 > \varepsilon(1 + x_3);\ \varepsilon > 0\}$$

Then define the stereographic projection of a point P into the plane as the function
from $U \cap M$ to $\mathbb{R}^2$:

$$f_{\varepsilon,P} = (u_1, u_2) = \left(\frac{x_1}{1 - x_3},\ \frac{x_2}{1 - x_3}\right)$$

The mapping is illustrated in sketch (a), where the following ratios can be seen to
hold: $x_1/u_1 = x_2/u_2 = (1 - x_3)/1$.

Sketch (a)

The projection is generated by drawing a line from the "North Pole" (0,0,1) to a point
on the sphere and continuing the line to the plane $x_3 = 0$. Thus, a point of the sphere
is associated with a point on the plane and vice versa.

The map is written $f_{\varepsilon,P}$ to call attention to the important role that the
parameter $\varepsilon$ plays in restricting its domain of definition. With the restriction,
f can be shown to be a diffeomorphism; without it, the function is not.

The one function is not enough to map the whole manifold. The point (0,0,1) and
some $\varepsilon$ neighborhood of it on the manifold have been excluded. Another similar map
that includes these points but excludes others can be given by a stereographic
projection from the "South Pole"

$$g_{\varepsilon,P} = (v_1, v_2) = \left(\frac{x_1}{1 + x_3},\ \frac{x_2}{1 + x_3}\right)$$

with $g_{\varepsilon,P}$ defined on the set

$$\{P : x_1^2 + x_2^2 + x_3^2 - 1 = 0;\ x_1^2 + x_2^2 > \varepsilon(1 - x_3);\ \varepsilon > 0\}$$

If the parameter $\varepsilon$ is given the value unity, f maps the lower hemisphere and
g maps the upper hemisphere onto the interior of the unit circle on the plane that
coincides with the plane $x_3 = 0$. Agreeably, the points of the sphere where $x_3 = 0$
go into the same values of $u_1$ and $u_2$ under both maps.
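
The two projections are easy to experiment with numerically. The following is a minimal sketch, not part of the original report: the function names and the helper `sphere_point` are hypothetical, and the sample angles are arbitrary. It evaluates the North Pole and South Pole projections at one point of each hemisphere and at one point of the equator.

```python
import math

def north_projection(x1, x2, x3):
    """Stereographic projection from the North Pole (0, 0, 1): the map f."""
    return (x1 / (1.0 - x3), x2 / (1.0 - x3))

def south_projection(x1, x2, x3):
    """Stereographic projection from the South Pole (0, 0, -1): the map g."""
    return (x1 / (1.0 + x3), x2 / (1.0 + x3))

def sphere_point(colatitude, longitude):
    """Hypothetical helper: a point of the unit sphere from two angles in radians."""
    s = math.sin(colatitude)
    return (s * math.cos(longitude), s * math.sin(longitude), math.cos(colatitude))

# A point of the lower hemisphere (x3 < 0), projected from the North Pole.
p_low = sphere_point(2.0, 0.7)
u = north_projection(*p_low)

# A point of the upper hemisphere (x3 > 0), projected from the South Pole.
p_up = sphere_point(0.5, 0.7)
v = south_projection(*p_up)

# A point of the equator (x3 = 0), projected both ways.
p_eq = sphere_point(math.pi / 2.0, 0.7)
u_eq = north_projection(*p_eq)
v_eq = south_projection(*p_eq)

print("f(lower point) radius:", math.hypot(*u))
print("g(upper point) radius:", math.hypot(*v))
print("equator under f and g:", u_eq, v_eq)
```

Both printed radii should be less than one, and the two equatorial images should coincide up to rounding, in line with the remarks above.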

The examples of one- and two-dimensional manifolds so far have been sets given in
some $\mathbb{R}^n$ and mapped into $\mathbb{R}^1$ or $\mathbb{R}^2$. Sets forming manifolds are not always described
naturally in some $\mathbb{R}^n$. To embed them in an $\mathbb{R}^n$ before showing that the definition is
satisfied may be an undesirably awkward task. In fact, it is not necessary, and we
will extend our previous definition so as to avoid it. That labor, however, will be
avoided only at the expense of our introducing more formalism now.

Let M be a second countable, Hausdorff topological space. A chart in M is a
pair $(V,\alpha)$ with V an open set and $\alpha$ a $C^\infty$ function onto an open set in $\mathbb{R}^n$ and
having a $C^\infty$ inverse. A $C^\infty$ atlas is a set of such charts, $\{(V_i,\alpha_i)\} = A$, with the
following properties:

(i) $M = \bigcup_i V_i$

(ii) If $(V_i,\alpha_i)$ and $(V_j,\alpha_j)$ are in A and
$V_i \cap V_j \neq \emptyset$, then

$$\alpha_j \circ \alpha_i^{-1} : \alpha_i(V_i \cap V_j) \to \alpha_j(V_i \cap V_j)$$

is a $C^\infty$ diffeomorphism.

Sketch (b), which illustrates the subsets $V_i$ and $V_j$ and maps $\alpha_i$ and $\alpha_j$, may
aid in picturing the content of condition (ii). With this formalism established, our
second definition of a manifold can now be given:

A $C^\infty$ manifold is a pair (M,A) where M is a second
countable Hausdorff topological space, and A is a maximal
$C^\infty$ atlas.

The conditions on the topology guarantee that the number of charts required to
cover M is countable. The word "maximal" gives a technical condition. It makes the
atlas the class of collections of just enough charts to form a countable basis of
charts. By referring to the class, one is not tied to a representation given by a
particular set of charts.

Although the definition seems unduly complicated, it turns out to be just what is
necessary to meet our intuition. Every m-dimensional manifold determined by the
definition can in fact be considered as a subset of $\mathbb{R}^n$ for some n: $m \le n \le 2m+1$.
Any weakening of the definition can allow objects which cannot be embedded in some
$\mathbb{R}^n$.


We opened this discussion of differentiable manifolds with the remark that
basically a differentiable manifold is a topological space that in the neighborhood of
each point looks like an open subset of $\mathbb{R}^k$. The first definition said that each
neighborhood, even though expressed as a subset of $\mathbb{R}^n$, was equivalent to $\mathbb{R}^k$. That
is, the space expressed in $\mathbb{R}^n$ really only had k, not n, degrees of freedom.
Another way of saying this is by saying that a k-dimensional manifold can be
expressed using n variables with n-k conditions imposed on them.

These remarks are made because in practice, manifolds are often given as the set 
of points where a certain function vanishes. The implicit function theorem gives 
conditions under which the vanishing of the function gives k constraints (exchanging
the k and n-k of the previous paragraph), so that only n-k of the variables are
free, and the space is a manifold of dimension n-k if the theorem is satisfied
everywhere. Then the manifold is said to be given implicitly, or by the implicit 
function theorem. 

Formalizing the above remarks, we consider a $C^\infty$ function F with domain
$A \subset \mathbb{R}^n$ and range in $\mathbb{R}^k$. That is, for every choice of n real numbers $(x_1,\dots,x_n)$
in A, the function F has the k real numbers $F = (f_1,\dots,f_k)$. Let M be the
set

$$M = \{x : F(x) = 0 = (0,0,\dots,0)\}$$

If the rank of the Jacobian matrix $F'$ is equal to k for all $x \in M$, then M is an
(n-k)-dimensional manifold.

Under the conditions stated, the implicit function theorem says that k of the 
variables can be expressed in terms of the other n-k, and the latter can be given 
values arbitrarily. Another statement of the implicit function theorem (see ref. 4, 
pg. 43) shows that a coordinate transformation can be found that assigns the value 
zero to the k explicit functions. In other words, the conditions of the first 
definition of a manifold are satisfied. 


Examples

Consider the real function $F = a_1x_1 + a_2x_2 + a_3x_3 - b = 0$ with the $a_i$ not all
zero. It is clear that $a_1x_1 + a_2x_2 + a_3x_3 - b = 0$ describes a plane, a
two-dimensional manifold, in $\mathbb{R}^3$.

It is not difficult to imagine a change of coordinates that reorients $\mathbb{R}^3$ so that
every point in the given plane can be written as $(y_1, y_2, 0)$, satisfying the first
definition for a two-dimensional manifold. One also sees that the Jacobian matrix of
F is $F' = (a_1, a_2, a_3)$, which has rank one for all x in F(x) = 0. The implicit
function theorem, then, says the manifold is of dimension 3 - 1 = 2.





Another example of using the implicit function theorem is given by the
two-dimensional manifold $S^2$. Here $F = x_1^2 + x_2^2 + x_3^2 - 1 = 0$, and $F' = (2x_1, 2x_2, 2x_3)$.
Now, $F'$ is not zero because not all $x_i$ vanish simultaneously, for F = 0 is not
satisfied by $x_1 = x_2 = x_3 = 0$. Thus, $F'$ has rank one and the manifold has
dimension 3 - 1 = 2.

Let a second condition be imposed on the space. For instance, consider the
circle resulting from passing a plane containing the origin through $S^2$. In
particular, consider the function

$$F : \mathbb{R}^3 \to \mathbb{R}^2, \qquad F = (f_1, f_2)$$

with $f_1$ as before,

$$f_1 = x_1^2 + x_2^2 + x_3^2 - 1 = 0$$

and

$$f_2 = x_2 - a x_1 = 0$$

The manifold M = {x : F = 0} is the circle $S^1$. The Jacobian matrix of F is:

$$F' = \begin{pmatrix} 2x_1 & 2x_2 & 2x_3 \\ -a & 1 & 0 \end{pmatrix}$$

the rank of which is two everywhere on the manifold. The dimension of this manifold,
therefore, equals 3 - 2, or 1.
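
The rank condition can be spot-checked numerically. The sketch below is illustrative only: it assumes the second constraint is the plane $x_2 - a x_1 = 0$ (the printed formula is hard to read), and the helper names `jacobian`, `rank_2x3`, and `circle_point` are hypothetical. It samples points of the circle and computes the rank of the 2 x 3 Jacobian from its 2 x 2 minors.

```python
import math

def jacobian(x, a):
    """Rows: gradients of f1 = x1^2 + x2^2 + x3^2 - 1 and f2 = x2 - a*x1 (assumed form)."""
    x1, x2, x3 = x
    return [[2.0 * x1, 2.0 * x2, 2.0 * x3], [-a, 1.0, 0.0]]

def rank_2x3(J, tol=1e-9):
    """Rank of a 2x3 matrix, decided from its 2x2 minors and its entries."""
    (p, q, r), (s, t, w) = J
    minors = (p * t - q * s, p * w - r * s, q * w - r * t)
    if any(abs(m) > tol for m in minors):
        return 2
    if any(abs(e) > tol for row in J for e in row):
        return 1
    return 0

def circle_point(angle, a):
    """A point satisfying both f1 = 0 and f2 = 0, parameterized by an angle in radians."""
    c = math.cos(angle) / math.sqrt(1.0 + a * a)
    return (c, a * c, math.sin(angle))

a = 0.75  # arbitrary slope of the cutting plane
ranks = [rank_2x3(jacobian(circle_point(k * math.pi / 6.0, a), a)) for k in range(12)]
print(ranks)
```

A rank of 2 at every sample is what the implicit function theorem needs in order to conclude that the zero set has dimension 3 - 2 = 1.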

On the other hand, consider the function $G : \mathbb{R}^3 \to \mathbb{R}^1$ defined by:

$$G(x_1, x_2, x_3) = x_1^2 + x_2^2 - x_3^2$$

The zero set of G is a cone. But note that the rank of $G'$ is 0 at (0,0,0). Thus,
at this point the cone doesn't satisfy the condition for the implicit function
theorem, our definition of a manifold, or our intuitive view of its being locally like
$\mathbb{R}^2$.

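To exercise our second definition of a manifold, let us consider $S^2$ again with
charts defined as follows:

$$V_1 = \{(x_1, x_2, x_3) \in S^2 : x_3 \neq 1\}$$

$$V_2 = \{(x_1, x_2, x_3) \in S^2 : x_3 \neq -1\}$$

$$\alpha_1(x_1, x_2, x_3) = (u_1, u_2) = \left(\frac{x_1}{1 - x_3},\ \frac{x_2}{1 - x_3}\right)$$

$$\alpha_2(x_1, x_2, x_3) = (v_1, v_2) = \left(\frac{x_1}{1 + x_3},\ \frac{x_2}{1 + x_3}\right)$$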

Then the set

$$A = \{(V_1,\alpha_1), (V_2,\alpha_2)\}$$

satisfies the first condition for an atlas. Consideration of the geometry of the
stereographic projection shows that $\alpha_1$ and $\alpha_2$, with their domains restricted
appropriately, actually map onto $\mathbb{R}^2$ and are one to one. Also,

$$\alpha_1(V_1 \cap V_2) = \mathbb{R}^2 - \{(0,0)\} = \alpha_2(V_1 \cap V_2)$$

where $\mathbb{R}^2 - \{(0,0)\}$ means that the point (0,0) has been deleted from $\mathbb{R}^2$.
Furthermore, after calculating the inverse:

$$\alpha_1^{-1}(u_1, u_2) = \left(\frac{2u_1}{u_1^2 + u_2^2 + 1},\ \frac{2u_2}{u_1^2 + u_2^2 + 1},\ \frac{u_1^2 + u_2^2 - 1}{u_1^2 + u_2^2 + 1}\right)$$

one can see that:

$$\alpha_2 \circ \alpha_1^{-1}(u_1, u_2) = (v_1, v_2) = \left(\frac{u_1}{u_1^2 + u_2^2},\ \frac{u_2}{u_1^2 + u_2^2}\right)$$

Since its partial derivatives of all orders exist and are continuous whenever
$(u_1, u_2) \neq (0,0)$, $\alpha_2 \circ \alpha_1^{-1}$ is a $C^\infty$ diffeomorphism. That a similar calculation
yields the same conclusion for $\alpha_1 \circ \alpha_2^{-1}$ confirms that A is an atlas.
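
The inverse and the transition map can be checked numerically. The sketch below is an illustration under the reading that the North Pole chart sends x to $(x_1/(1-x_3), x_2/(1-x_3))$ and the South Pole chart sends x to $(x_1/(1+x_3), x_2/(1+x_3))$; the function names and sample points are hypothetical.

```python
def alpha1(x):
    """North Pole chart: x -> (x1/(1 - x3), x2/(1 - x3))."""
    x1, x2, x3 = x
    return (x1 / (1.0 - x3), x2 / (1.0 - x3))

def alpha2(x):
    """South Pole chart: x -> (x1/(1 + x3), x2/(1 + x3))."""
    x1, x2, x3 = x
    return (x1 / (1.0 + x3), x2 / (1.0 + x3))

def alpha1_inv(u):
    """Inverse of the North Pole chart, from the plane back to the sphere."""
    u1, u2 = u
    d = u1 * u1 + u2 * u2 + 1.0
    return (2.0 * u1 / d, 2.0 * u2 / d, (u1 * u1 + u2 * u2 - 1.0) / d)

def transition(u):
    """The chart change alpha2 o alpha1^{-1}, defined for u != (0, 0)."""
    return alpha2(alpha1_inv(u))

def inversion(u):
    """The closed form u / |u|^2 that the chart change should reduce to."""
    u1, u2 = u
    r2 = u1 * u1 + u2 * u2
    return (u1 / r2, u2 / r2)

samples = [(0.3, -1.2), (2.0, 0.5), (-0.01, 0.02)]
for u in samples:
    print(u, transition(u), inversion(u))
```

For each sample the last two printed pairs should agree up to rounding, and applying `alpha1` to `alpha1_inv(u)` should return u.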


Thus, $S^2$ satisfies our new definition of a manifold. In fact, a little thought
makes one realize that any space that is a manifold by the first definition is also
one by the second. Furthermore, anything given as a manifold by the implicit function 
theorem satisfies both definitions. 


Another, trivial but important, example of a class of manifolds is afforded by
any open subset of $\mathbb{R}^n$. There the atlas may consist of the set itself together with
the identity map. Thus, the notion that manifolds are spaces that locally look like
open subsets of $\mathbb{R}^n$ is at least self-consistent. This example is important because
the whole idea of the definition of manifolds is to be able to see how calculations
valid in $\mathbb{R}^n$ carry over into any other manifold.

Another example of a manifold, which is an open set of Euclidean space and which
is important in systems theory, follows. Let

$$\dot{x} = Ax + bu$$

be a single-input controllable system. Recall that controllability is equivalent to
having the rank of the matrix

$$[b, Ab, A^2b, \dots, A^{n-1}b]$$

equal to n where A is an n x n matrix. Now let M be the set of pairs (A,b)
such that a system is controllable: $M = \{(A,b) : \dot{x} = Ax + bu \text{ is controllable}\}$.

The complement of this set is the set that satisfies the condition:

$$\det[b, Ab, A^2b, \dots, A^{n-1}b] = 0$$

Since this is a closed set in $\mathbb{R}^{n^2+n}$, the set M is open in $\mathbb{R}^{n^2+n}$ and therefore is
a manifold.

The system, being of single input, is a special case. In general, when the
control distribution function B is an n x m matrix, M* is also a manifold where M*
is the set:

$$M^* = \{(A,B) : \dot{x} = Ax + Bu \text{ is controllable}\}$$

Although the conditions are more involved and less easy to describe than the
determinant condition above, a similar argument shows that the controllable pairs are an
open subset of $\mathbb{R}^{n^2+nm}$.

A more general example along these same lines is the set of triples of matrices
(A,B,C) representing the system

$$\dot{x} = Ax + Bu$$
$$y = Cx$$

If the system is controllable and observable, it can be shown that this set of triples 
is also an open subset of a suitable Euclidean space. 

Related to this manifold is a set of matrix transfer functions T(s). These are 
matrices of rational functions that arise as the Laplace transforms of the above
systems. Whether this set {T(s)} is a manifold is a deep question in systems theory.
It has been answered affirmatively by Martin Clark (ref. 5) and Roger Brockett
(ref. 6), and independently by Michiel Hazewinkel (ref. 7) and by Christopher Byrnes 
and N. Hurt (ref. 8). Much of the study in linear systems is involved with various 
properties of this manifold. 


Manifold Maps 

We have described manifolds and seen a few examples of them. Now we can describe 
the requirements on functions that allow them to be maps between manifolds. 

A function f,

$$f : M \to N$$

is a manifold map if for every $x \in M$ and chart $(V,\alpha)$ with $x \in V$, there is a chart
$(U,\beta)$ for N with $f(V) \subset U$ such that the composite function $\beta \circ f \circ \alpha^{-1}$:

$$\beta \circ f \circ \alpha^{-1} : \alpha(V) \to \beta(U)$$

is a $C^\infty$ function. The relations are illustrated in sketch (c) for $f = A_1$,
the map in the following example.

As an example, let $M = N = S^2$. Let A be a matrix such that $I = A^TA$, an
orthogonal transformation. Then, if $x \in S^2$,

$$|Ax|^2 = (Ax) \cdot (Ax) = x^TA^TAx = x^Tx = 1$$

Sketch (c)

Thus, A maps $S^2$ onto $S^2$. Consider a specific A, namely $A_1$ (its entries are
illegible in the source), and let $\alpha$ and $\beta$ be the previous stereographic
projection $\alpha_1$. Those relations become for $\alpha$:

$$u_1 = \frac{x_1}{1 - x_3}, \qquad u_2 = \frac{x_2}{1 - x_3}$$

Both $\alpha$ and $\beta$ are $C^\infty$ functions on their respective domains. The composite map
relating $(v_1, v_2)$ and $(u_1, u_2)$ is also clearly $C^\infty$. The map $A_1$, therefore, is a
manifold map.

Thus, orthogonal transformations are manifold maps. They also form a manifold.
The set of n x n orthogonal matrices forms an (n/2)(n - 1)-dimensional manifold.
The truth of this statement will be verified by using the implicit function theorem.

The defining relation of an orthogonal matrix can be used to give a function of
the matrices into the zero set:

$$f(X) = X^TX - I = 0$$

It must now be shown that the rank of the derivative is constant over all elements of
the set. The derivative at X = A can be found through the definition:

$$\lim_{\|H\| \to 0} \frac{1}{\|H\|}\,[f(X + H) - f(X) - f'(X)(H)] = 0$$

where the differential $f'(A)(H)$ recognizes that the derivative evaluated at A is a
linear operator on H. Performing the expansion and considering the limit gives for
the differential:

$$f'(A)(H) = H^TA + A^TH$$

Now A is invertible by definition and therefore, in the n x n case, is a one-to-one
and onto map. Any matrix H is thus the image of some matrix under A, and one can
replace H by AH. Since $A^TA = I$, the derivative then gives:

$$f'(A)(AH) = H^T + H$$

Thus, the range of the derivative is seen to be the set of symmetric n x n matrices,
which implies that it has the constant rank (n/2)(n + 1). Following the
considerations of the implicit function theorem, the orthogonal n x n matrices form an
(n/2)(n - 1)-dimensional manifold, since $n^2 - (n/2)(n + 1) = (n/2)(n - 1)$.
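
The differential can be compared with a finite difference. The sketch below is illustrative, not part of the original derivation: it takes f(X) to be X-transpose times X minus the identity, takes the differential at A applied to H to be H-transpose times A plus A-transpose times H, and checks both against a 3 x 3 rotation. The matrix helpers, the rotation angle, and the direction H are hypothetical choices.

```python
import math

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def f(X):
    """f(X) = X^T X - I, which vanishes exactly on the orthogonal matrices."""
    return add(matmul(transpose(X), X), scale(-1.0, identity(len(X))))

def df(A, H):
    """The differential f'(A)(H) = H^T A + A^T H."""
    return add(matmul(transpose(H), A), matmul(transpose(A), H))

def max_abs(P):
    return max(abs(p) for row in P for p in row)

theta = 0.4  # arbitrary rotation angle about the third axis
A = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]
H = [[0.3, -1.0, 2.0], [0.5, 0.1, -0.7], [1.2, 0.0, 0.4]]  # arbitrary direction

t = 1e-6
finite_difference = scale(1.0 / t, add(f(add(A, scale(t, H))), scale(-1.0, f(A))))
fd_error = max_abs(add(finite_difference, scale(-1.0, df(A, H))))
sym_error = max_abs(add(df(A, matmul(A, H)), scale(-1.0, add(transpose(H), H))))

n = len(A)
print("f(A) residual:", max_abs(f(A)))
print("finite-difference error:", fd_error)
print("f'(A)(AH) - (H^T + H):", sym_error)
print("dimension n^2 - n(n+1)/2 =", n * n - n * (n + 1) // 2)
```

The finite-difference error shrinks in proportion to t, as the limit definition requires, and the last line counts the dimension for n = 3.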

A final consideration for this section is that of forming manifolds from the
cartesian products of manifolds. If we have a manifold M with atlas A, we can
construct a new manifold:

$$M \times M = \{(X,Y) : X, Y \in M\}$$

from M and A. The charts are constructed from the charts of A in the natural way
as products, i.e., if $(V_1,\alpha_1)$ and $(V_2,\alpha_2)$ are charts in A, then a chart for $M \times M$
is given by $(V_1 \times V_2,\ \alpha_1 \times \alpha_2)$ where

$$(\alpha_1 \times \alpha_2)(X,Y) = (\alpha_1(X), \alpha_2(Y))$$


TANGENT SPACES 


The previous section defined manifolds and gave several examples of them. This 
section considers a basic construction of one manifold from another. While the 
method of construction itself is of interest insofar as it illustrates general proce- 
dures of modern differential geometry, the particular result, the tangent space, is an 
object of great importance: it is by way of the tangent space that calculus can be
done in general situations.

To gain familiarity with the idea of a tangent space, it will be worthwhile to 
spend some time with an example, that of the tangent space to the sphere. The infor- 
mation in the previous section concerning charts for the sphere will allow charts to 
be constructed for this new space. The atlas resulting from the construction will be 
examined in the light of the earlier definitions to see that this tangent space forms 
a manifold. The example is useful, too, for giving insight into such things as the 
dimensionality of a tangent space and the fact that its maps preserve its linear and 
differentiable structure. Part of the problem of constructing an atlas is that a map
must be inverted and its composition with another map shown to be a diffeomorphism.
Reducing our example from a sphere to a circle will simplify this calculation
considerably.

Next, still preparatory to considering the general construction of a tangent 
space, the notion of equivalence classes of curves on a manifold, and their addition 
and scalar multiplication will be explored. This study provides the guide to the 
constructions that follow, and to the confirmation that the tangent space is a
manifold.

The rest of the section will be devoted to the tangent space in general. It will 
be seen to be a manifold whose charts and chart maps are derived from those of the 
underlying manifold. It will be seen to have vector space properties. Similar 
properties of maps between tangent manifolds will be examined. The differentiating 
properties of these induced maps will be noted. 


The Tangent Space of $S^2$

Consider an object moving on $S^2$, the surface of a sphere in space. As it moves,
it generates a velocity vector which is tangent to the sphere at each point. Since it
is tangent, the velocity vector lies in the tangent plane at each point and moves
continuously through tangent planes as the object moves smoothly along the sphere.

The concepts of a set of tangent planes and of smoothly transitioning from one to
another are made precise by endowing the set of tangent planes together with their
points of attachment with a manifold structure. That the set of tangent planes thus
generates a manifold will be shown in two ways. The first uses the implicit function
theorem. The second constructs charts and chart maps explicitly and shows that the
chart maps have the requisite properties. This explicit construction reveals the
relation of tangent spaces to derivative operations on the manifolds from which they
are obtained.


Let the sphere, $S^2$, be described by the equation

$$x_1^2 + x_2^2 + x_3^2 = 1$$

and let the charts and chart maps be as given in the previous section. The tangent
plane to $S^2$ at $x = (x_1, x_2, x_3)$ is the set

$$T_x = \{(x_1 + y_1,\ x_2 + y_2,\ x_3 + y_3) : x_1y_1 + x_2y_2 + x_3y_3 = 0\}$$

This is the space of vectors orthogonal to the radius vector $(x_1, x_2, x_3)$ and
translated to the origin. The manifold structure is obvious since there are two equations,
one for the sphere, and the other for the tangent plane. That is, the tangent space,
$T(S^2)$, is just the following set of points of $\mathbb{R}^6$:

$$\{(x,y) : x_1^2 + x_2^2 + x_3^2 - 1 = 0;\ x_1y_1 + x_2y_2 + x_3y_3 = 0\}$$

The Jacobian matrix, therefore, is:

$$\begin{pmatrix} 2x_1 & 2x_2 & 2x_3 & 0 & 0 & 0 \\ y_1 & y_2 & y_3 & x_1 & x_2 & x_3 \end{pmatrix}$$

Since not all the $x_i$ vanish simultaneously, the rank of the matrix is 2 everywhere.
The dimension of the manifold, therefore, is 4, corresponding to 2 degrees of freedom
on the sphere, and 2 additional degrees of freedom on the plane tangent to the sphere
at any point.

Now, if the object were moving freely in space and not constrained to the
surface of the sphere, it would have three degrees of freedom of position. Its
velocity vector, being unconstrained, would also have three degrees of freedom. As a
manifold, then, the tangent space would be made up of two copies of $\mathbb{R}^3$. That is to
say, the tangent space of $\mathbb{R}^3$, $T(\mathbb{R}^3)$, is the cartesian product space $\mathbb{R}^3 \times \mathbb{R}^3$. If the
object is considered to move in some open subset U of $\mathbb{R}^3$, then the space needed to
describe all its possible positions and velocities is $U \times \mathbb{R}^3$. In this case,

$$T(U) = U \times \mathbb{R}^3$$

Return to considering motion on the sphere. From the chart maps of the previous
section we know that $S^2$ is locally like $\mathbb{R}^2$. It would be reasonable to suppose,
then, that the tangent space of $S^2$ should look locally like $\mathbb{R}^2 \times \mathbb{R}^2$, as will be
confirmed by the calculations investigated in the following paragraphs. The
calculations will consider a curve, c, on the sphere. To learn what is to be meant by the
tangent, or velocity, of the curve at some point on the sphere, we will calculate the
velocity of the image of the curve in $\mathbb{R}^2$, where we know what a tangent to a curve is.
The image of c in $\mathbb{R}^2$ is obtained through the chart map, of course. Inspecting the
form of the calculation of the tangent to the curve in $\mathbb{R}^2$ will show that it is the
image of the tangent of the curve in the manifold under a mapping given by the
derivative of the chart map. This inspection leads to the definition of the tangent to the
curve on the manifold, and to the isolation of the map that maps it into $\mathbb{R}^2$.

Let c be a curve on the sphere; that is, c(t) is the position of a point at t,
where t is an element of an open interval in $\mathbb{R}^1$ and t = 0 is likewise in this
interval. If it isn't, define a translation of $\mathbb{R}^1$ so that the zero does occur
there. Furthermore, define the translation so that $c(0) = x_0$. Let $(U,\phi)$ be a chart
containing c(0). The chart map transforms the curve c(t) by the composition $\phi \circ c$
into a curve in $\mathbb{R}^2$. The velocity vector of $\phi \circ c$ is just the derivative with
respect to time: $(d/dt)\,\phi \circ c$. It would be reasonable to expect that a chart map of
the tangent space, $T(S^2)$, map the velocity vector of c into the velocity vector of
$\phi \circ c$. Now, the derivative of $\phi \circ c$ at t = 0 is just:

$$\frac{d}{dt}(\phi \circ c)(0) = (\phi \circ c)'(0) = \phi'(x_0)\,c'(0)$$

since the derivatives in question are defined. Thus, the velocity vectors $c'$ are
mapped by the derivative $\phi'$ of the chart map to elements of the "local" tangent
space $(\phi \circ c)'$. Take, for example, the chart map $\phi$:

$$\phi(x) = \left(\frac{x_1}{1 - x_3},\ \frac{x_2}{1 - x_3}\right)$$

then $\phi'$ is the matrix:

$$\phi'(x_1, x_2, x_3) = \begin{pmatrix} \dfrac{1}{1 - x_3} & 0 & \dfrac{x_1}{(1 - x_3)^2} \\ 0 & \dfrac{1}{1 - x_3} & \dfrac{x_2}{(1 - x_3)^2} \end{pmatrix}$$


A reasonable candidate for a chart in the tangent space $T(S^2)$ is then $[T(U), T\phi(x,y)]$:

$$T(U) = \{(x_1, x_2, x_3, y_1, y_2, y_3) : \textstyle\sum_i x_i^2 - 1 = 0;\ x \cdot y = 0;\ x_3 \neq 1\}$$

$$T\phi(x,y) = [\phi(x),\ \phi'(x)y] = \left[\frac{x_1}{1 - x_3},\ \frac{x_2}{1 - x_3},\ \frac{y_1}{1 - x_3} + \frac{x_1y_3}{(1 - x_3)^2},\ \frac{y_2}{1 - x_3} + \frac{x_2y_3}{(1 - x_3)^2}\right]$$

It can be seen that $T\phi$ maps $T(U)$ into $\mathbb{R}^2 \times \mathbb{R}^2$, as expected.
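
The claim that the derivative of the chart map carries the velocity of a curve on the sphere to the velocity of its image can be tested with a finite difference. The sketch below is illustrative: it assumes the chart map sends x to $(x_1/(1-x_3), x_2/(1-x_3))$, and the test curve (a meridian of the sphere), its velocity, and all function names are hypothetical.

```python
import math

def phi(x):
    """North Pole chart map: x -> (x1/(1 - x3), x2/(1 - x3))."""
    x1, x2, x3 = x
    return (x1 / (1.0 - x3), x2 / (1.0 - x3))

def dphi(x, y):
    """Apply the 2x3 derivative matrix of phi at x to the vector y."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    d = 1.0 - x3
    return (y1 / d + x1 * y3 / d ** 2, y2 / d + x2 * y3 / d ** 2)

def T_phi(x, y):
    """Candidate chart map on the tangent space: (x, y) -> (phi(x), phi'(x) y)."""
    return phi(x) + dphi(x, y)

def curve(t):
    """Hypothetical test curve: a meridian of the unit sphere, away from the poles at t = 0."""
    s = math.sin(t + 1.0)
    return (s * math.cos(0.5), s * math.sin(0.5), math.cos(t + 1.0))

def curve_velocity(t):
    """Exact time derivative of the test curve."""
    k = math.cos(t + 1.0)
    return (k * math.cos(0.5), k * math.sin(0.5), -math.sin(t + 1.0))

x0 = curve(0.0)
y0 = curve_velocity(0.0)

# Velocity of the image curve phi(c(t)) at t = 0 by a central difference.
h = 1e-6
numeric = tuple((p - m) / (2.0 * h) for p, m in zip(phi(curve(h)), phi(curve(-h))))
exact = dphi(x0, y0)

print("x0 . y0 =", sum(a * b for a, b in zip(x0, y0)))
print("finite difference:", numeric)
print("phi'(x0) c'(0):   ", exact)
print("T_phi(x0, y0):    ", T_phi(x0, y0))
```

The two velocity estimates should agree to many digits, and the first printed number should be zero up to rounding, confirming that the velocity lies in the tangent plane at x0.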

Similarly, another tentative chart for $T(S^2)$ can be derived from the other
chart for $S^2$:

$$T(V) = \{(x_1, x_2, x_3, y_1, y_2, y_3) : \textstyle\sum_i x_i^2 - 1 = 0;\ x \cdot y = 0;\ x_3 \neq -1\}$$

$$T\gamma(x,y) = [\gamma(x),\ \gamma'(x)y] = \left[\frac{x_1}{1 + x_3},\ \frac{x_2}{1 + x_3},\ \frac{y_1}{1 + x_3} - \frac{x_1y_3}{(1 + x_3)^2},\ \frac{y_2}{1 + x_3} - \frac{x_2y_3}{(1 + x_3)^2}\right]$$

For proof that these charts and maps form an atlas, it must be shown that the
union of the sets T(U) and T(V) covers the tangent space, and that compositions of
one map with the inverse of the other over common domains of definition lead to $C^\infty$
maps. It is clear that the first condition is satisfied, that $T(S^2) = T(U) \cup T(V)$.
The remainder of the argument is tedious and will be carried out only for the circle,
for which it will be seen that $[T(U), T\phi]$ and $[T(V), T\gamma]$ actually form an atlas.
Although computing the composition map of, for example, $T\phi \circ (T\gamma)^{-1}(a,b)$ is tedious,
it is important to note that the outcome has the form

$$T\phi \circ (T\gamma)^{-1}(a,b) = [\phi \circ \gamma^{-1}(a),\ L(a)b]$$

where L(a) is an invertible linear transformation for each a in the domain. Thus,
the composite map preserves both the differentiable structure of the tangent space
and its linear structure.


The computations are particularly simple for the circle, $S^1$, which will be
obtained by restricting $S^2$ to the plane $x_1 = 0$. $S^1$ will be mapped onto $\mathbb{R}^1$ by
$\phi$ and $\psi$ where the image of $x = (x_2,x_3)$ will be $u_1$ and $v_1$, respectively. The image
of the tangent in $S^1$, $y = (y_2,y_3)$, will be $u_2$ and $v_2$, respectively. $T(S^1)$ is the set:

$$\{(x,y):\ x_2^2 + x_3^2 - 1 = 0;\ x_2 y_2 + x_3 y_3 = 0\}$$

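Under the chart maps $\phi, \psi$:

$$u_1 = \phi(x) = \frac{x_2}{1 - x_3};\qquad v_1 = \psi(x) = \frac{x_2}{1 + x_3}$$

The Jacobian matrices at $(x,y)$ are:

$$\phi'(x) = \left(\frac{1}{1 - x_3},\ \frac{x_2}{(1 - x_3)^2}\right);\qquad
\psi'(x) = \left(\frac{1}{1 + x_3},\ -\frac{x_2}{(1 + x_3)^2}\right)$$

giving the tangent vectors in $\mathbb{R}^1$:

$$u_2 = \phi'(x)y = \frac{y_2}{1 - x_3} + \frac{x_2 y_3}{(1 - x_3)^2};\qquad
v_2 = \psi'(x)y = \frac{y_2}{1 + x_3} - \frac{x_2 y_3}{(1 + x_3)^2}$$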


To compute $\phi^{-1}(u_1)$ and $\psi^{-1}(v_1)$ requires use of $x_2^2 + x_3^2 - 1 = 0$. One finds:

$$x_2 = \frac{2u_1}{u_1^2 + 1} = \frac{2v_1}{v_1^2 + 1};\qquad
x_3 = \frac{u_1^2 - 1}{u_1^2 + 1} = \frac{1 - v_1^2}{v_1^2 + 1}$$

for points in the domain common to $\phi$ and $\psi$.

To compute $(\phi')^{-1}(u_1,u_2)$ and $(\psi')^{-1}(v_1,v_2)$ requires use of both $x_2^2 + x_3^2 - 1 = 0$
and $x_2 y_2 + x_3 y_3 = 0$. One finds:

$$y_2 = \frac{-2u_2(u_1^2 - 1)}{(u_1^2 + 1)^2} = \frac{2v_2(1 - v_1^2)}{(v_1^2 + 1)^2};\qquad
y_3 = \frac{4u_1 u_2}{(u_1^2 + 1)^2} = \frac{-4v_1 v_2}{(v_1^2 + 1)^2}$$

Finally, the composition $T\phi \circ (T\psi)^{-1}(v) = [\phi' \circ (\psi')^{-1}(v)]$ can be found
to be:

$$u_1 = \frac{1}{v_1};\qquad u_2 = -\frac{v_2}{v_1^2}$$

These expressions give a diffeomorphism, since the point $v_1 = 0$ is excluded as
not being in the image of $U \cap V$. Note that $u_2$ is given by a linear transformation
of $v_2$ at each point $v_1$.
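
The transition map for the circle can be checked numerically. The following is a minimal sketch, not part of the original report: the function names (`t_phi`, `t_psi`, `transition`, `mismatch`) are illustrative, and the parametrization of the circle by an angle is an assumption made only to generate test points satisfying $x_2^2 + x_3^2 = 1$ and $x_2 y_2 + x_3 y_3 = 0$.

```python
import math

def t_phi(x, y):
    """Tangent chart T(phi) on the circle: (x, y) -> (u1, u2), valid for x3 != 1."""
    x2, x3 = x
    y2, y3 = y
    return x2 / (1 - x3), y2 / (1 - x3) + x2 * y3 / (1 - x3) ** 2

def t_psi(x, y):
    """Tangent chart T(psi) on the circle: (x, y) -> (v1, v2), valid for x3 != -1."""
    x2, x3 = x
    y2, y3 = y
    return x2 / (1 + x3), y2 / (1 + x3) - x2 * y3 / (1 + x3) ** 2

def transition(v1, v2):
    """Closed form of T(phi) o T(psi)^-1 on the overlap, where v1 != 0."""
    return 1.0 / v1, -v2 / v1 ** 2

def mismatch(angle, speed):
    """Largest difference between T(phi)(x, y) and transition(T(psi)(x, y))."""
    x = (math.sin(angle), math.cos(angle))                    # x2^2 + x3^2 = 1
    y = (speed * math.cos(angle), -speed * math.sin(angle))   # x2*y2 + x3*y3 = 0
    u = t_phi(x, y)
    w = transition(*t_psi(x, y))
    return max(abs(u[0] - w[0]), abs(u[1] - w[1]))

# Angles 0 and pi are avoided: they are the two points left out of U or V.
for angle in (0.5, 1.0, 2.5, -2.0):
    print(angle, mismatch(angle, speed=1.7))
```

The printed mismatches should be at the level of floating-point rounding, which is consistent with $u_2$ depending linearly on $v_2$ for each fixed $v_1$.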


Equivalence Classes of Curves 

The example of a tangent space started with a sphere and a plane tangent to the
sphere at a point. This plane was seen to contain the tangent to the velocity of a
curve passing through the point of attachment. As a matter of fact, the plane is the
locus of tangents to all the curves on the surface at the point, a curve being a map
of an interval of $\mathbb{R}$ into some region of the manifold. The interval of the real
line is so adjusted for discussion that $t = 0$ corresponds to the point $p$ of the
manifold: $c(0) = p$. The curves can be grouped into classes. Being a tangent vector
is taken to be a class property, and the tangent plane can be determined by a set of
independent tangent vectors.

The classes are equivalence classes. Two curves are in the same class if they
are equivalent to each other. They are equivalent if they pass through the same
point and have the same velocity there. The velocity is measured in the local
Euclidean frame given by the chart map attached at the point. Thus, the curves
$c_1(t)$ and $c_2(t)$ are equivalent if

$$\frac{d}{dt}\,\phi \circ c_1(t)\Big|_{t=0} = \frac{d}{dt}\,\phi \circ c_2(t)\Big|_{t=0}$$

This equality is also written as

$$(\phi \circ c_1)'(\phi(p)) = (\phi \circ c_2)'(\phi(p))$$

The set of all curves equivalent to $c$ at $p$ is denoted by $[c]_p$. This symbol of the
class of curves includes the point of attachment as well as the tangent vector.

Note that a particular chart was used in the definition of equivalence. It must
be shown that the definition is independent of the particular chart used. This
independence is something that must be routinely verified in almost every definition of
differential geometry. In this case, as often, the verification requires just a
routine manipulation of derivatives. Suppose $(V,\psi)$ is another chart at
$p = c_1(0) = c_2(0)$, and that $c_1$ is equivalent to $c_2$:

$$(\phi \circ c_1)'(\phi(p)) = (\phi \circ c_2)'(\phi(p))$$

Then applying the chain rule of differentiation symbolically:

$$\begin{aligned}
(\psi \circ c_1)'(\phi(p)) &= (\psi \circ \phi^{-1} \circ \phi \circ c_1)'(\phi(p)) \\
&= (\psi \circ \phi^{-1})'(\phi \circ c_1(0)) \circ (\phi \circ c_1)'(\phi(p)) \\
&= (\psi \circ \phi^{-1})'(\phi \circ c_2(0)) \circ (\phi \circ c_2)'(\phi(p)) \\
&= (\psi \circ \phi^{-1} \circ \phi \circ c_2)'(\phi(p)) \\
&= (\psi \circ c_2)'(\phi(p))
\end{aligned}$$

Thus, the definition goes through with any chart map.
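
The chart independence just shown can be illustrated on the circle with its two stereographic charts. This is only a sketch under stated assumptions: the curves `c1`, `c2`, and `c3` and the helper `velocity` are hypothetical names chosen for illustration, and velocities are estimated by central differences rather than computed exactly.

```python
import math

def phi(x):
    """Chart on the circle minus the point (0, 1); x = (x2, x3)."""
    return x[0] / (1.0 - x[1])

def psi(x):
    """Chart on the circle minus the point (0, -1); x = (x2, x3)."""
    return x[0] / (1.0 + x[1])

def on_circle(angle):
    return (math.sin(angle), math.cos(angle))

def c1(t):
    return on_circle(1.0 + t)

def c2(t):
    # Same point and same velocity as c1 at t = 0, but a different curve.
    return on_circle(1.0 + t + 3.0 * t * t)

def c3(t):
    # Same point as c1 at t = 0, but twice the velocity.
    return on_circle(1.0 + 2.0 * t)

def velocity(chart, curve, h=1e-5):
    """Central-difference estimate of (chart o curve)'(0)."""
    return (chart(curve(h)) - chart(curve(-h))) / (2.0 * h)

for chart in (phi, psi):
    print(chart.__name__, velocity(chart, c1), velocity(chart, c2), velocity(chart, c3))
```

In both charts `c1` and `c2` should show the same velocity up to discretization error, while `c3` differs in both; the numerical values themselves change from chart to chart, but the equivalence relation does not.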

The set of tangent vectors at a point, $\{[c]_p\}$, can be seen to form a vector
space once it is understood just how addition of curves and their multiplication by a
constant works. The operations act on the derivatives: $[c_2]_p = a[c_1]_p$ means
$(\phi \circ c_2)'(\phi(p)) = a(\phi \circ c_1)'(\phi(p))$, according to the definition. The common point,
$p = c_1(0) = c_2(0)$, remains fixed.

Consider two curves $c_1(t)$ and $c_2(t)$ on a manifold, $M$, with $I_1 \subset \mathbb{R}$ and $I_2 \subset \mathbb{R}$
such that $t \in I_1 \mapsto c_1(t)$, $t \in I_2 \mapsto c_2(t)$, $t = 0 \in I_1 \cap I_2 \neq \emptyset$ (the zero of time
occurs in the common interval, which is not empty). Suppose $c_1 \in [c_1]_p$ and
$c_2 \in [c_2]_p$, and the question is how to add them; that is, how to define $(c_1 + c_2)(t)$.

The meaning of addition and scalar multiplication of curves is clear when the
operations are defined in the local cartesian space. The definitions come out most
easily when the chart maps map the point $p$ of the manifold into the origin of the
local $\mathbb{R}^n$: $\phi \circ c(0) = \phi(p) = 0$. Under these conditions, $\phi \circ c_1$ and $\phi \circ c_2$ are curves at
0 in $\mathbb{R}^n$. Hence,

$$(\phi \circ c_1 + \phi \circ c_2):\ I_1 \cap I_2 \to \mathbb{R}^n$$

is also a curve there, and $\phi^{-1}(\phi \circ c_1 + \phi \circ c_2)$ is a curve at $p$. Then addition is
defined by defining $[c_1]_p + [c_2]_p$ to be the equivalence class $[\phi^{-1}(\phi \circ c_1 + \phi \circ c_2)]_p$:

$$[c_1]_p + [c_2]_p = [\phi^{-1}(\phi \circ c_1 + \phi \circ c_2)]_p = [c_1 + c_2]_p$$

That the identification of the sum of equivalence classes of curves as the
equivalence class of the sum of curves is well defined, that is, is independent of the
particular curves chosen from the class, is easily shown. Let $c_1, b_1 \in [c_1]_p$, and
$c_2, b_2 \in [c_2]_p$.

$$\begin{aligned}
\frac{d}{dt}(\phi \circ c_1 + \phi \circ c_2)(0)
&= \frac{d}{dt}\,\phi \circ c_1(0) + \frac{d}{dt}\,\phi \circ c_2(0) \\
&= \frac{d}{dt}\,\phi \circ b_1(0) + \frac{d}{dt}\,\phi \circ b_2(0) \\
&= \frac{d}{dt}(\phi \circ b_1 + \phi \circ b_2)(0)
\end{aligned}$$

Then

$$[\phi^{-1}(\phi \circ c_1 + \phi \circ c_2)]_p = [\phi^{-1}(\phi \circ b_1 + \phi \circ b_2)]_p$$

showing that addition is well defined. Multiplication by a scalar is also well
defined.
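
The construction of the sum through a chart can be sketched numerically on the sphere. Everything named below is an assumption for illustration only: the base point `P`, the chart `chart` (stereographic projection shifted so that `P` maps to the origin), the great-circle curves, and the finite-difference helper `chart_velocity`.

```python
import math

P = (0.6, 0.0, 0.8)  # base point on the unit sphere

def stereo(x):
    """Stereographic chart: (x1, x2, x3) -> (x1, x2) / (1 - x3)."""
    return (x[0] / (1.0 - x[2]), x[1] / (1.0 - x[2]))

def stereo_inv(u):
    s = u[0] ** 2 + u[1] ** 2
    return (2 * u[0] / (s + 1), 2 * u[1] / (s + 1), (s - 1) / (s + 1))

U0 = stereo(P)

def chart(x):
    """Chart phi normalized so that phi(P) = 0."""
    u = stereo(x)
    return (u[0] - U0[0], u[1] - U0[1])

def chart_inv(w):
    return stereo_inv((w[0] + U0[0], w[1] + U0[1]))

def great_circle(v, reparam=lambda t: t):
    """Curve on the sphere through P with velocity v at t = 0 (v orthogonal to P)."""
    speed = math.sqrt(sum(c * c for c in v))
    def c(t):
        s = reparam(t) * speed
        return tuple(math.cos(s) * p + math.sin(s) * vi / speed for p, vi in zip(P, v))
    return c

def add_curves(ca, cb):
    """The curve phi^-1(phi o ca + phi o cb), a representative of [ca]_p + [cb]_p."""
    def total(t):
        a, b = chart(ca(t)), chart(cb(t))
        return chart_inv((a[0] + b[0], a[1] + b[1]))
    return total

def chart_velocity(c, h=1e-5):
    a, b = chart(c(h)), chart(c(-h))
    return ((a[0] - b[0]) / (2 * h), (a[1] - b[1]) / (2 * h))

c1 = great_circle((0.0, 1.0, 0.0))
c2 = great_circle((0.8, 0.0, -0.6))
b1 = great_circle((0.0, 1.0, 0.0), reparam=lambda t: t + 2.0 * t * t)  # equivalent to c1

total = add_curves(c1, c2)
total_alt = add_curves(b1, c2)
print(chart_velocity(total), chart_velocity(total_alt))
```

Both sums should report the same chart velocity, equal to the sum of the chart velocities of `c1` and `c2`, which is the numerical counterpart of the statement that addition does not depend on the representatives chosen.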


The Tangent Space in General 

The set of all equivalence classes of all curves passing through the point $p$
of a manifold is said to be its tangent space there:

$$T_p(M) = \{[c]_p,\ \text{for all } c(t) \in M \text{ with } c(0) = p\}$$

The collection of tangent spaces of all the points of the manifold is called the
tangent space of the manifold:

$$T(M) = \bigcup_p T_p(M)$$

This tangent space is itself a manifold with a structure that maintains both the
linear structure of the equivalence classes of curves and the differentiable structure
of the manifold from which it comes.


During the discussion of the tangent space to the sphere, it was mentioned that
any open set of it had the structure of the cartesian product $U \times \mathbb{R}^2$ where $U$ was
an open subset of the sphere, with the local appearance of $\mathbb{R}^2 \times \mathbb{R}^2$. This product
structure of the tangent manifold is general and is understood to follow from its
definition as a union of tangent spaces when the nature of the point $[c]_p$ in general
is recalled. Since $[c]_p$ is the class of curves at $p$, it can be written (in the
appealing form of a Taylor expansion, but with some equivocation of addition):
$[p + c'(0)t]_p$. It is specified by the vectors $c(0) = p$ and $c'(0)$, and can be written
as $[p,c'(0)]_p$. Thus, the identification of $\bigcup_p T_p(U)$ with $U \times \mathbb{R}^n$ follows from
identifying $[c]_p$ with $[p,c'(0)]_p$.

Again, locally, one chart map for $T(S^2)$ was given as

$$T\phi(x,y) = [\phi(x),\ \phi'(x)y] \subset \mathbb{R}^2 \times \mathbb{R}^2$$

which in the present notation is $[\phi(p),\ \phi' c'(0)]$. To show that this form holds generally,
let $(U,\phi)$ be a chart in $M$. Being an open subset of $M$, $U$ is a manifold. Thus,
$T(U)$ is a well defined subspace of $T(M)$. The corresponding map, $T\phi$, maps $T(U)$ into
$T(\phi(U))$. That is to say,

$$T\phi[c]_p = [\phi \circ c]_{\phi(p)}$$

Taking $c(t) \in [c]_p$ in the form $c(t) = c(0) + c'(0)t$, one can expand $\phi$ similarly:

$$\phi(c(t)) = \phi(p + c'(0)t) = \phi(p) + \phi' c'(0)t$$

to first order in $t$, for small enough $t$. Since $(\phi \circ c)'(0)$ defines the equivalence class, one has

$$T\phi[c]_p = [\phi \circ c]_{\phi(p)} = [\phi(p),\ \phi' c'(0)]_{\phi(p)}$$

as claimed, and:

$$T\phi:\ T(U) \to T(\phi(U)) = \phi(U) \times \mathbb{R}^n$$

Thus, the intuition obtained from the discussion of the sphere holds.

For $T\phi$ to be a chart map, it must be invertible, or, equivalently, "one-to-one
and onto." The identification of $T(\phi(U))$ with $\phi(U) \times \mathbb{R}^n$ shows that it is onto. To
verify that it is one to one, suppose that

$$T\phi[c]_p = T\phi[b]_q$$

which was just seen to mean

$$[\phi \circ c]_{\phi(p)} = [\phi \circ b]_{\phi(q)}$$

This implies that $\phi(p) = \phi(q)$, and since $\phi$ is one to one, $p = q$. This fact,
together with $(\phi \circ c)' = (\phi \circ b)'$ and the definition of equivalence, shows that
$[c]_p = [b]_q$. Hence, $T\phi$ is one to one and onto an open set.

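To investigate the compatibility and differentiability requirements of chart
maps, let $(V,\psi)$ be another chart and chart map in $M$. Construct

$$T\psi:\ T(V) \to \psi(V) \times \mathbb{R}^n$$

and assume $U \cap V \neq \emptyset$. The map $T\phi \circ (T\psi)^{-1}$ is found as follows. Let
$(a,b) \in \psi(V) \times \mathbb{R}^n$. Then

$$\begin{aligned}
T\phi \circ (T\psi)^{-1}(a,b) &= T\phi[\psi^{-1} \circ (a + bt)]_{\psi^{-1}(a)} \\
&= \left[\phi(\psi^{-1}(a)),\ \frac{d}{dt}\,\phi \circ \psi^{-1} \circ (a + bt)\Big|_{t=0}\right] \\
&= [\phi \circ \psi^{-1}(a),\ (\phi \circ \psi^{-1})'(a)b]
\end{aligned}$$

Since we are assuming $\phi \circ \psi^{-1}$ is $C^\infty$, $T\phi \circ (T\psi)^{-1}$ is also a $C^\infty$ function. Thus, we
have an atlas for $T(M)$ whose charts are derived from those of $M$ and are given by
$(T(U),T\phi)$ and the like. Furthermore, note that the composite map is differentiable,
and since $(\phi \circ \psi^{-1})'(a)$ is a linear map, it also respects the linear structure of the
tangent space.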


Mapping Between Tangent Spaces 

Our discussion of mapping between tangent spaces need not examine the require- 
ments put on the maps themselves. They are the same as those between manifolds in 
general, and were found in the previous section. They were very similar to the 
requirements of chart maps, to which, of course, they must reduce when the map between 
the manifolds is the identity. 

This same resemblance between the behavior of manifolds under chart maps (between the
global and local manifolds) and the behavior of the tangent spaces under chart maps
will be seen in the two properties to be looked at now. One of the points to be made
is that the tangent manifold map contains the manifold map and its derivative in the
same way that the tangent manifold chart map was seen to contain the manifold chart
map and its derivative. The other point to be made is that the vector space structure
of the tangent manifold is preserved under the maps. This property is due to the
linearity of the derivative map.

The map between tangent spaces contains the map between manifolds and its
derivative. These results can be obtained very quickly in just the same way that the
same result was seen to hold for the tangent manifold chart maps. For simplicity,
consider $M$ and $N$ to be open subsets of Euclidean space and $f: M \to N$ a manifold
map. It was noted before that the equivalence classes of curves in $M$ are
represented by linear functions; that is,

$$[c]_p = [p + c'(0)t]_p$$

Now if $f: M \to N$ then

$$Tf[p + c'(0)t]_p = [f(p + c'(0)t)]_{f(p)} = [f(p) + t f'(p)c'(0)]_{f(p)}$$

The map of a particular curve, then, can be written as:

$$Tf(p,c'(0)) = (f(p),\ Df(p)c'(0)) \qquad (3.1)$$

For fixed $p$, then, the linear map is just $Df(p)$, and $Tf$ contains both $f$ and the
derivative.
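
Equation (3.1) can be illustrated with a small numerical sketch. The map `f` below is an arbitrary smooth map of the plane chosen purely for illustration (it does not come from the report), `Df` is its hand-computed Jacobian, and `image_curve_velocity` is a hypothetical helper that differentiates $f \circ c$ directly for the straight-line representative $c(t) = p + vt$.

```python
import math

def f(p):
    """An illustrative smooth map from R^2 to R^2."""
    x, y = p
    return (x * y, math.sin(x) + y * y)

def Df(p):
    """Jacobian matrix of f at p, computed by hand."""
    x, y = p
    return ((y, x), (math.cos(x), 2.0 * y))

def Tf(p, v):
    """Tangent map in the form of equation (3.1): (p, v) -> (f(p), Df(p) v)."""
    J = Df(p)
    image_vector = tuple(sum(J[i][j] * v[j] for j in range(2)) for i in range(2))
    return f(p), image_vector

def image_curve_velocity(p, v, h=1e-6):
    """Central-difference velocity of f o c at t = 0, where c(t) = p + v t."""
    fwd = f((p[0] + h * v[0], p[1] + h * v[1]))
    bwd = f((p[0] - h * v[0], p[1] - h * v[1]))
    return tuple((a - b) / (2.0 * h) for a, b in zip(fwd, bwd))

p, v = (0.5, -1.2), (2.0, 0.3)
print(Tf(p, v)[1], image_curve_velocity(p, v))
```

The two printed vectors should agree to within the finite-difference error, showing that the second component of $Tf$ is the velocity of the image curve.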

Consider, for an example, a rotation of the sphere. Let $M = N = S^2$. If $f$ is
a rotation of $S^2$, it can be represented by an orthogonal matrix, say, $f(x) = Ax$,
and $AA^T = I$. Let $c$ be a curve on $S^2$ with $c(0) = p = (x_1,x_2,x_3)$. If
$c'(0) = (y_1,y_2,y_3)$, then $c$ determines the point in the tangent space
$(x_1,x_2,x_3,y_1,y_2,y_3)$. Note that the definition of the tangent space $T(S^2)$
requires that $p^T c'(0) = x_1 y_1 + x_2 y_2 + x_3 y_3 = 0$, which will also have to hold in the
image space after the rotation.

The rotation $f$ determines a new curve $f \circ c$ at $f(p)$, and

$$(f \circ c)'(0) = (Ac)'(0) = Ac'(0)$$

The tangent space map, then, looks like

$$Tf(p,c'(0)) = (Ap,\ Ac'(0))$$

where $A = Df(p)$ since a rotation is a linear map. To verify that the image of the
map is also $T(S^2)$, note that

$$(Ap)^T Ac'(0) = p^T A^T A c'(0) = p^T c'(0) = 0$$

as expected.
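
A short numerical sketch of the rotation example follows. The particular rotation (a composition of rotations about two coordinate axes), the point `p`, and the tangent vector `v` are arbitrary choices made for illustration, and the helper names are hypothetical.

```python
import math

def matvec(A, x):
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def matmul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))

def tangent_map(A, p, v):
    """Tf(p, v) = (Ap, Av) for the linear map f(x) = Ax."""
    return matvec(A, p), matvec(A, v)

A = matmul(rot_z(0.7), rot_x(-1.1))   # a rotation, so A^T A = I
p = (0.6, 0.0, 0.8)                   # point on the unit sphere
v = (0.8, 1.0, -0.6)                  # tangent at p: dot(p, v) = 0

q, w = tangent_map(A, p, v)
print(dot(q, q), dot(q, w))
```

The first printed number should be 1 and the second 0 up to rounding, confirming that the image pair $(Ap, Av)$ again satisfies the two conditions that define $T(S^2)$.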

To examine the addition property under mappings of the tangent manifolds, again
consider the manifold map $f: M \to N$. Let $(U,\phi)$ be a chart in $M$ and $(V,\psi)$ be a chart
in $N$. Assume that $f(U) \subset V$ (otherwise, take $U'$ to be $f^{-1}(V) \cap U$). We will
examine the action of $Tf$ on a single tangent plane $T_p(M)$.

Let $p \in U$. We have an addition defined on $T_p(M)$ with respect to the chart
$(U,\phi)$ and an addition defined on $T_{f(p)}(N)$ with respect to the chart $(V,\psi)$. We want
to compare $Tf([c_1]_p + [c_2]_p)$ with $Tf[c_1]_p + Tf[c_2]_p$ with the hope that they are
equal. The argument is typical: we reduce each expression, using definitions, until
something rather obvious appears to connect terms:

$$\begin{aligned}
Tf([c_1]_p + [c_2]_p) &= Tf[\phi^{-1} \circ (\phi \circ c_1 + \phi \circ c_2)]_p \\
&= [f \circ \phi^{-1} \circ (\phi \circ c_1 + \phi \circ c_2)]_{f(p)}
\end{aligned}$$

and

$$\begin{aligned}
Tf[c_1]_p + Tf[c_2]_p &= [f \circ c_1]_{f(p)} + [f \circ c_2]_{f(p)} \\
&= [\psi^{-1} \circ (\psi \circ f \circ c_1 + \psi \circ f \circ c_2)]_{f(p)}
\end{aligned}$$

By definition, the final two equivalence classes are equal if and only if

$$\frac{d}{dt}\,\psi \circ f \circ \phi^{-1} \circ (\phi \circ c_1(t) + \phi \circ c_2(t))\Big|_{t=0}
= \frac{d}{dt}\,(\psi \circ f \circ c_1(t) + \psi \circ f \circ c_2(t))\Big|_{t=0}$$

Now calculating the derivative on the left yields

$$\begin{aligned}
\frac{d}{dt}\,\psi \circ f \circ \phi^{-1} \circ (\phi \circ c_1(t) + \phi \circ c_2(t))\Big|_{t=0}
&= (\psi \circ f \circ \phi^{-1})'(0)\,[(\phi \circ c_1)'(0) + (\phi \circ c_2)'(0)] \\
&= (\psi \circ f \circ \phi^{-1})'(0)(\phi \circ c_1)'(0) + (\psi \circ f \circ \phi^{-1})'(0)(\phi \circ c_2)'(0) \\
&= (\psi \circ f \circ \phi^{-1} \circ \phi \circ c_1)'(0) + (\psi \circ f \circ \phi^{-1} \circ \phi \circ c_2)'(0) \\
&= \frac{d}{dt}\,[\psi \circ f \circ c_1(t) + \psi \circ f \circ c_2(t)]\Big|_{t=0}
\end{aligned}$$

Thus, we have the important fact that the addition goes through so that the
restriction of $Tf$ to the tangent planes is a linear function.

We might remark parenthetically that tangent spaces are a special case of the
more general concept of vector bundles. A vector bundle is a triple of objects
$(\pi,E,B)$ where $E$ and $B$ are manifolds, $\pi$ is a manifold mapping of $E$ onto $B$, and
$\pi^{-1}(b)$ is a vector space for each $b \in B$. In the case of tangent spaces, $B$ is the
manifold, $E$ is the tangent space $T(B)$, and $\pi$ is the map defined by $\pi([c]_p) = p$.
The vector space $\pi^{-1}(b) = T_b(B)$ is called the fiber of $E$ over $b$. A map $s: B \to E$
that selects one element from each fiber, so that $\pi \circ s$ is the identity on $B$, is
called a cross section.


All the words and worries of this and the previous sections should not obscure 
what are basically simple concepts. The wealth of discussion and terminology aims at 
separating the many ideas growing close together as topics in geometry and analysis 
grow. A simple example and some diagrams may help keep the reader aware of the whole 
topic as details are described. 

In a typical elementary discussion, the derivative of a polynomial might be shown
as $(d/dx)(x^2) = 2x$. A later discussion would say that differentiating the function
$x \mapsto x^2$ gives the function $x \mapsto 2x$. This statement can be shown diagrammatically like
sketch (d).

[Sketch (d): diagram in which the function $f: x \mapsto x^2$ is carried to $df: x \mapsto 2x$; figure not reproduced]


The advantage is that the function can be viewed as an object having a structure and 
operations of its own. Though this looks more complicated than necessary, it does 
help resolve details of what is happening (ref. 9). 


Similarly, the relationships of a mapping between manifolds, of their tangent
spaces, and of the mapping between the tangent spaces induced by that between the
manifolds can be illustrated like sketch (e),

[Sketch (e): diagram relating $f: M \to N$ and $Tf: T(M) \to T(N)$; figure not reproduced]

which shows the relation between the manifold map, $f: M \to N$, and the map thereby
induced between the tangent spaces, $Tf: T(M) \to T(N)$. Looking back over the discussion
of charts, one can see that the same sort of diagram illustrates the same relations
between a chart map to the local $\mathbb{R}^n$ and that induced in the tangent space
(sketch (f)).

The maps $X$ and $Y$ in sketches (e) and (f) relate the manifolds $M$, $N$, and $\mathbb{R}^n$
to their respective tangent spaces. The discussion of these relations is the task of
the next section.


VECTOR FIELDS AND THEIR ALGEBRA


So far, manifolds and their tangent spaces have been defined and discussed. It 
has been shown that the tangent spaces are related to velocities. Their fundamental 
relation to differential equations has been hinted at. This relation will become 
explicit in this section which examines vector fields as entities that relate tangent 
spaces to the manifolds underlying them. 

Defining vector fields this way is a modern choice among alternatives. They 
actually have many familiar connections. Once one has started from one definition 
and from the fundamental properties coming directly from this definition, then the 
properties coming from the other connections have to be established. To help estab- 
lish these properties, the notion of "derivation" — the operation of taking deriva- 
tives — is brought in. This notion is first given abstractly, then interpreted in 
terms of the Euclidean plane. Some of its properties are easy to obtain. These are 
related to those of a vector field by showing that there is an isomorphism between 
spaces of vector fields and spaces of derivations. Finally, it is an important fact
that vector fields form not only a space, but also an algebra. The last part of this
section contains a discussion of a multiplication rule (symbolized by brackets) and
some of the properties of the algebra coming from the multiplication, and an example
using the linear state equations of control theory.


Vector Fields


Let $M$ be a manifold and $T(M)$ its tangent space. A vector field, $X$, is a manifold
map from $M$ to $T(M)$ such that for every $p \in M$, the vector field at the point $p$
gives a point in the tangent space attached to the manifold there: $X(p) \in T_p(M)$.

Recall that each point, $t$, on the smooth curve $c(t)$ has a velocity vector
belonging to a class with tangent vector $c'(t)$, and that $T_p(M)$ is the set of
equivalence classes of curves through the point $p$, at $t = 0$. A good choice of scale,
then, gives $c'(t)$ as the point $(t, \frac{d}{dt}(t))$, or, $(t,1)$. Now consider the curve $c$
on $M$ with domain $I$ as a map between manifolds; namely, $c: I \to M$. Let $c'(t)$
denote the curve in $T_p(M)$: $c'(0) \in [c]_p$. As shown in sketch (g), there is also an
induced map $Tc$ between the tangent space to $I$, $I \times \mathbb{R}$, and $T(M)$: $Tc: I \times \mathbb{R} \to T(M)$.

[Sketch (g): commuting diagram of $c: I \to M$ and $Tc: I \times \mathbb{R} \to T(M)$; figure not reproduced]

The commuting properties shown in sketch (g) indicate that $c'(t)$ is the image of the
point $(t,1)$:

$$c'(t) = Tc(t,1) \qquad (4.1)$$

Now the definition of vector fields says that $c'(t)$ is the image of $c(t) \in M$ by the
vector field $X$:

$$c'(t) = X(c(t)) \qquad (4.2)$$

To look at this relation in more detail, consider it in local coordinates where
$M$ can be taken as an open subset of $\mathbb{R}^n$ and its tangent space, $T(M)$, a subset of
$\mathbb{R}^n \times \mathbb{R}^n$. Equations (4.1) and (3.1) give the tangent vector $c'(t)$ as:

$$c'(t) = Tc(t,1) = (c(t),\ Dc(t) \cdot 1) = (c(t),\ \dot{c}(t)) \qquad (4.3)$$

Now we can write $X(c(t))$ in the form:

$$X(c(t)) = (c(t),\ X(c(t))) \qquad (4.4)$$

A comparison of equations (4.2), (4.3), and (4.4) shows that:

$$\frac{d}{dt}\,c(t) = X(c(t)) \qquad (4.5)$$

This is a system of differential equations.

Equation (4.5) is important. It gives a conceptual identification of a moving 
point with a function of position. It shows that the thrust of the definition of 
vector fields is that the manifold is the locus over all initial conditions of all 
the curves whose tangent vectors are given by equation (4.5). They are the "integral 
curves" of X. 

The entire local theory of ordinary differential equations applies to the system 
of equations in (4.5). Its study in terms of manifolds gives rise to some global 
results. One example of a global result is given by the following theorem which will 
be used in a later section: 

Theorem: Let $M$ be a compact manifold, $X$ a vector field on
$M$, and $c: I \to M$ an integral curve of $X$. Then the domain of $c$
can be extended to $\mathbb{R}$.

One interpretation of this theorem, for example, is that there is no finite escape 
time for solutions of differential equations on compact manifolds, which means that 
the solutions are well behaved over any finite time interval. 
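
As a hedged illustration of equation (4.5) and of the theorem, the sketch below integrates a vector field on a compact manifold, the unit circle embedded in the plane. The field $X(x) = (-x_2, x_1)$, the classical fourth-order Runge-Kutta integrator, and the step counts are all assumptions chosen for illustration; they are not taken from the report.

```python
import math

def X(x):
    """Vector field on the unit circle: X(x) = (-x2, x1), tangent because x . X(x) = 0."""
    return (-x[1], x[0])

def rk4_step(field, x, dt):
    """One classical fourth-order Runge-Kutta step for dx/dt = field(x)."""
    k1 = field(x)
    k2 = field(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k1)))
    k3 = field(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k2)))
    k4 = field(tuple(xi + dt * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )

def integrate(field, x0, t_end, steps):
    """Approximate the integral curve through x0 at time t_end."""
    x, dt = x0, t_end / steps
    for _ in range(steps):
        x = rk4_step(field, x, dt)
    return x

end = integrate(X, (1.0, 0.0), 10.0, 10000)
radius = math.hypot(end[0], end[1])
print(end, (math.cos(10.0), math.sin(10.0)), radius)
```

The computed point should match the exact integral curve $(\cos t, \sin t)$ and remain on the circle for any integration time. For contrast, on the noncompact manifold $\mathbb{R}$ the field $X(x) = x^2$ has the integral curve $x_0/(1 - x_0 t)$, which escapes to infinity at the finite time $t = 1/x_0$ when $x_0 > 0$.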

Derivations 

Having defined vector fields and shown that they determine the "right-hand side"
of ordinary differential equations, we turn to the concept of derivations that later
will be shown to be equivalent to that of a vector field. The concept is useful also
because it is usually easier to perform the derivations than to calculate the
velocity vector directly.

Let $M$ be a manifold and let $F(M)$ be all the real-valued functions that map
$M$ into $\mathbb{R}$. Since when any of them are evaluated at a given point they are just real
numbers, any two functions $f,g \in F(M)$ can be added and multiplied pointwise:

(f + g)(x) = f(x) + g(x) 

(fg)(x) = f(x)g(x) 

Since the right-hand sides are familiar operations on real numbers, they define the 
symbols on the left. (These definitions, together with statements about multiplying 
by constants and associative properties, make the set of real functions F(M) into a 
"ring." An alternative approach to our development of manifolds uses rings of func- 
tions defined on open sets rather than charts because knowing F(M) you know M. ) 

A derivation, $\theta$, on $F(M)$ is defined as a function that maps $F(M)$ to $F(M)$, with
the following properties:

(1) $\theta(cf) = c\,\theta(f)$, $c \in \mathbb{R}$

(2) $\theta(f + g) = \theta(f) + \theta(g)$

(3) $\theta(fg) = \theta(f)g + f\theta(g)$


Properties (1) and (2) say that $\theta$ is a linear map on $F(M)$. Property (3) shows
that the operation is not linear when the coefficient of a function is not a constant
but another function. It looks suspiciously like the usual derivative of the product
of functions. It should come as no surprise, then, that these three properties
defining the derivation operation, $\theta$, abstract the usual idea of a derivative.

Now derivatives are differencing operations that need to have three objects
specified before they give a value, say, a real number. They need a function to work
on, a location at which the results can be evaluated, and a heading away from the
point of evaluation. The statement $\theta f(x)$, or $\theta(f)(x)$, gives a real number. The
function $f$ and the point of evaluation, $x$, are explicit. Implicit in $\theta$, then, are
the notions of limits of differencing and of the direction of differencing.

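Consider an example. Let $M$ be the Euclidean plane: $M = \mathbb{R}^2$. Then take $F(\mathbb{R}^2)$
as the set of all real-valued functions with continuous partial derivatives of all
orders. Let $\theta$ be given by:

$$\theta = \frac{\partial}{\partial x}$$

Then

$$\theta(f) = \frac{\partial f}{\partial x}$$

and, by the ordinary rules of calculus, one has

$$\theta(fg) = \theta(f)g + f\theta(g)$$

A more general example of a derivation for $f \in F(\mathbb{R}^2)$ is

$$\theta(f) = a_1(x,y)\frac{\partial f}{\partial x} + a_2(x,y)\frac{\partial f}{\partial y} \qquad (4.6)$$

where $a_1$ and $a_2$ also belong to $F(\mathbb{R}^2)$. The direction of differentiation is $(a_1, a_2)$.
In fact, every derivation of $F(\mathbb{R}^2)$ has the form of equation (4.6), as is seen from
the following.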

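Consider a function $f \in F(\mathbb{R}^2)$. Then the equation

$$f(x,y) = f(a,b) + \int_0^1 \frac{d}{dt}\,f(a + t(x - a),\ b + t(y - b))\,dt$$

is an identity. It can be written as:

$$\begin{aligned}
f(x,y) = f(a,b) &+ (x - a)\int_0^1 \frac{\partial f}{\partial x}(a + t(x - a),\ b + t(y - b))\,dt \\
&+ (y - b)\int_0^1 \frac{\partial f}{\partial y}(a + t(x - a),\ b + t(y - b))\,dt
\end{aligned}$$

For a fixed $a$ and $b$, this formula is true for all $x$ and $y$. Letting

$$g_1(x,y) = \int_0^1 \frac{\partial f}{\partial x}(a + t(x - a),\ b + t(y - b))\,dt$$

and

$$g_2(x,y) = \int_0^1 \frac{\partial f}{\partial y}(a + t(x - a),\ b + t(y - b))\,dt$$

we can write

$$f(x,y) = f(a,b) + (x - a)g_1(x,y) + (y - b)g_2(x,y) \qquad (4.7)$$

Define two projection functions:

$$p_1(x,y) = x \quad \text{and} \quad p_2(x,y) = y \qquad (4.8)$$

Then equation (4.7) can be written in function notation as

$$f = f(a,b) + (p_1 - a)g_1 + (p_2 - b)g_2$$

Now applying an arbitrary derivation, $\theta$, to $f$ gives:

$$\begin{aligned}
\theta(f) &= \theta(p_1 g_1) - a\theta(g_1) + \theta(p_2 g_2) - b\theta(g_2) \\
&= \theta(p_1)g_1 + p_1\theta(g_1) - a\theta(g_1) + \theta(p_2)g_2 + p_2\theta(g_2) - b\theta(g_2) \qquad (4.9)
\end{aligned}$$

since properties (1) and (3) give the condition that $\theta(c) = 0$ for any constant $c$
(property (3) gives $\theta(1) = \theta(1 \cdot 1) = 2\theta(1)$, so $\theta(1) = 0$, and property (1) then gives
$\theta(c) = c\,\theta(1) = 0$). Evaluating expression (4.9) at the point $(a,b)$ gives

$$\begin{aligned}
\theta(f)(a,b) &= \theta(p_1)(a,b)g_1(a,b) + a\theta(g_1)(a,b) - a\theta(g_1)(a,b) \\
&\quad + \theta(p_2)(a,b)g_2(a,b) + b\theta(g_2)(a,b) - b\theta(g_2)(a,b) \\
&= \theta(p_1)(a,b)g_1(a,b) + \theta(p_2)(a,b)g_2(a,b)
\end{aligned}$$

Now

$$g_1(a,b) = \int_0^1 \frac{\partial f}{\partial x}(a,b)\,dt = \frac{\partial f}{\partial x}(a,b)$$

likewise

$$g_2(a,b) = \frac{\partial f}{\partial y}(a,b)$$

Thus, we have

$$\theta(f) = \theta(p_1)\frac{\partial f}{\partial x} + \theta(p_2)\frac{\partial f}{\partial y}$$

which has the form of equation (4.6).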
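
The decomposition (4.7) can be checked numerically. This is a minimal sketch under stated assumptions: the test function `f`, its hand-computed partial derivatives `f_x` and `f_y`, the composite Simpson rule used for the integrals, and the sample points are all illustrative choices that do not appear in the report.

```python
import math

def f(x, y):
    return x * x * y + math.sin(y)

def f_x(x, y):
    return 2.0 * x * y

def f_y(x, y):
    return x * x + math.cos(y)

def simpson(func, n=200):
    """Composite Simpson rule on [0, 1]; n must be even."""
    h = 1.0 / n
    total = func(0.0) + func(1.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(k * h)
    return total * h / 3.0

def g1(x, y, a, b):
    """Integral of df/dx along the segment from (a, b) to (x, y)."""
    return simpson(lambda t: f_x(a + t * (x - a), b + t * (y - b)))

def g2(x, y, a, b):
    """Integral of df/dy along the segment from (a, b) to (x, y)."""
    return simpson(lambda t: f_y(a + t * (x - a), b + t * (y - b)))

def reconstructed(x, y, a, b):
    """Right-hand side of equation (4.7)."""
    return f(a, b) + (x - a) * g1(x, y, a, b) + (y - b) * g2(x, y, a, b)

a, b = 0.5, -0.2
x, y = 1.4, 0.9
print(f(x, y), reconstructed(x, y, a, b))
print(g1(a, b, a, b), f_x(a, b), g2(a, b, a, b), f_y(a, b))
```

The first line should show two nearly equal numbers, and the second confirms that $g_1$ and $g_2$ reduce to the partial derivatives at $(a,b)$, which is the step that turns (4.9) into the form (4.6).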


A Digression on Notation 

The symbol $\theta$ has been introduced. It is about to be related to the symbol $X$,
for a vector field, through the symbol $L_X$, for a Lie derivative. All relate to
symbols used for elements of the tangent space. All will look like a common gradient or
directional derivative when applied to real functions in ordinary Euclidean space.
With this much notation being used to emphasize different aspects of basically the
same set of objects, it seems desirable to digress from the development of ideas to a
comparison of the forms under discussion.

Notational problems begin with the expression for the tangent spaces. A single
tangent space, $T_p(M)$, is a linear vector space attached to a particular point $p$ of
a manifold. It has the structure of the product of the spaces: $M \times \mathbb{R}^n$; that is, a
tangent vector can be written as $(p,x)$, with $p \in M$ and $x \in \mathbb{R}^n$. All the vector
operations are done with the second component, but they only make sense if they are
done at the same point of space: $(p,x) + (p,y) = (p,x + y)$. Whenever vector
operations of tangent vectors are discussed, it is assumed that the objects of the
operation reside at the same point of the manifold, even if this requirement is not
referred to explicitly. They are not defined otherwise.

The explicit expression for the $\mathbb{R}^n$ term has been written variously as $c'(0)$,
$(f \circ c(0))'$, $f'(0)c'(0)$, and so on, depending on obvious circumstances. Strictly
speaking, these expressions refer to the velocity of the curve at a point in the manifold,
which is equivalent to a tangent vector in the tangent space, a distinction not always
kept clear.

The velocity aspects of this term, which refers to its source in a curve on a
manifold, may not always be significant; that the tangent lives in $\mathbb{R}^n$ is. It might
be referred to as $Df$ when something generic is meant. It might take the form
$df(m)(m,x)$ in mapping an element $(m,x)$ from one tangent space to another. This form
emphasizes the linear operator aspects of $df(m)$. If the original manifold is
Euclidean, then $df(m)(m,x)$ can be written explicitly as $(\partial f/\partial x^i(m),\ x^i)$ where the
outer parentheses and the comma denote an inner product. This can also look like
$(\partial f/\partial x^i(m),\ a^i)$, where its connection with the general form of $\theta f = a^i(\partial f/\partial x^i)$ is
clear. This is also the general form for a directional derivative. The same form
represents $X(f)$ under these circumstances. It is also the form that a Lie derivative
takes when it operates on a scalar function.


Isomorphism Between Vector Fields and Derivations 

It has already been admitted that vector fields and derivations are just
different sides of the same coin. Their identification with each other is formalized by an
isomorphism between them. It will be shown that each vector field on a manifold gives
rise to a unique derivation of $F(M)$, and that to the result of every algebraic operation
between vector fields there corresponds the result of an algebraic operation between
the corresponding derivations.

Recall that for $f \in F(M)$, the range of the induced map $Tf$ is $\mathbb{R} \times \mathbb{R}$:

$$Tf:\ T(M) \to T(\mathbb{R}) = \mathbb{R} \times \mathbb{R}$$

Also recall that its restriction to $T_m(M)$,

$$T_m f = Tf\,|\,T_m(M)$$

was shown earlier to be a linear map:

$$T_m f:\ T_m(M) \to \{f(m)\} \times \mathbb{R}$$

Define $df(m)$ by:

$$df(m) = p_2 \circ T_m f$$

where $p_2$ is the projection onto the second coordinate as defined in equation (4.8):
$p_2(a,b) = b$.

When $M$ is an open subset of $\mathbb{R}^n$ so that $T(M) = M \times \mathbb{R}^n$, then $f$ is a
real-valued function of $n$ real variables. The induced map at the point $(m,x)$ of the
tangent space gives:

$$T_m f(m)(m,x) = (f(m),\ Df(m)x) \in \mathbb{R} \times \mathbb{R}$$

from which the number $df(m)(m,x)$ can be identified:

$$df(m)(m,x) = p_2 \circ T_m f(m)(m,x) = Df(m)x$$

It can be seen from those expressions that $df(m)$ is the linear operator $Df(m)$.

Since $df(m)$ is a linear map from $\mathbb{R}^n$ to $\mathbb{R}$, where it gives a linear functional,
it is an element of a space dual to $T_m(M)$, and can be represented as a row of $n$
elements. This row notation is quite compatible with the explicit notation for a
differential usually to be found in texts of advanced calculus:

$$df(m) = \left(\frac{\partial f}{\partial x^1}(m),\ \ldots,\ \frac{\partial f}{\partial x^n}(m)\right) \qquad (4.10)$$

If $x^i(x)$ is the $i$th coordinate function for the Euclidean manifold,

$$df(m)(m,dx) = \frac{\partial f}{\partial x^1}(m)\,dx^1 + \cdots + \frac{\partial f}{\partial x^n}(m)\,dx^n \qquad (4.11)$$

which formalizes the usual expression in advanced calculus texts:

$$df = \frac{\partial f}{\partial x^1}\,dx^1 + \cdots + \frac{\partial f}{\partial x^n}\,dx^n$$

The expressions in (4.10) and (4.11) are worth a second glance in light of the
comment that $df(m)$ is an element of the space dual to the tangent space $T_m(M)$.
Though the terms in the right-hand side of (4.11) are real numbers, their factors come
from (4.10) and a column of $dx^i$. If the $dx^i$ are considered to be unit vectors in
the dual space, (4.11) is an element of that space. According to the usual method of
evaluating the coefficients of a vector or of a dual (or co-) vector, the coefficients
of (4.11) must come from the action of unit elements in the original vector space to
which the $dx^i$ are dual. It does no violence to any interpretation of the usual
differential expressions to take these unit vectors to be the $\partial/\partial x^i$.

The above view sees expressions like (4.11) in a new light. It allows the
introduction of another notation, that of the Lie derivative of a real function, $f$, with
respect to a vector field $X$. This is written:

$$L_X(f)(m) = df(m)(X(m)) \qquad (4.12)$$

This Lie derivative form will prove to be a notational convenience whose use can be
interchanged with the differential operator form. Since the two forms shift different
elements in or out of parentheses and exchange the written order of the factors, the
choice of form used will be dictated largely by the desire for brevity of notation,
and, of course, by which of the factors are relevant at the time.

The definition in (4.12) is pointwise: it holds at each point in the tangent and
dual, or cotangent, spaces associated with the point $m$. Since $L_X$ is going to give
the isomorphism between $X$ and $\theta$, $L_X f$ will have to be an element of $F(M)$, and will
have to be extended from the point definition in (4.12). To show $L_X f \in F(M)$, one
proves that $df$ is a smooth map. This follows after the cotangent space is proved
to be a manifold. The details are lengthy and technical. The interested reader is
referred to reference 3 for the construction of the cotangent space and for the proof
that $L_X f \in F(M)$. This then establishes that $L_X$ maps from $F(M)$ to $F(M)$.

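To see that $L_X$ is linear, recall what has been shown about $T_m f$ and remember
in particular that $T_m f: T_m(M) \to \mathbb{R} \times \mathbb{R}$. Equation (4.12) gives:

$$L_X(f)(m) = df(m)(X(m)) = p_2 \circ T_m f(X(m))$$

Consider

$$\begin{aligned}
d(f + g)(m)[c]_m &= p_2 \circ T_m(f + g)[c]_m \\
&= p_2((f + g)(m),\ D((f + g) \circ c)(0)) \\
&= D(f \circ c)(0) + D(g \circ c)(0) \\
&= p_2(f(m),\ D(f \circ c)(0)) + p_2(g(m),\ D(g \circ c)(0)) \\
&= p_2 \circ T_m f[c]_m + p_2 \circ T_m g[c]_m \\
&= (p_2 \circ T_m f + p_2 \circ T_m g)[c]_m \\
&= (df(m) + dg(m))[c]_m
\end{aligned}$$

So $d$ is additive and the rest of linearity is easy.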

The same procedure that showed that $L_X$ is a linear operator on $F(M)$ can be
used to show that it is a derivation:

$$\begin{aligned}
p_2 \circ T_m(fg)[c]_m &= p_2(fg(m),\ D(fg \circ c)(0)) \\
&= D(fg \circ c)(0) \\
&= D((f \circ c)(g \circ c))(0) \\
&= f \circ c(0)\,D(g \circ c)(0) + g \circ c(0)\,D(f \circ c)(0) \\
&= f(m)\,D(g \circ c)(0) + g(m)\,D(f \circ c)(0) \\
&= f(m)\,p_2 \circ T_m g[c]_m + g(m)\,p_2 \circ T_m f[c]_m
\end{aligned}$$

Thus, $L_X(fg) = fL_X(g) + gL_X(f)$, showing that each vector field, $X$, determines a
derivation on $F(M)$.
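
The pointwise definition (4.12) and the two properties just derived can be sketched numerically in the plane. The helper names (`gradient`, `lie_derivative`), the vector field `X`, and the test functions `f` and `g` are assumptions chosen for illustration; the row $df(m)$ is approximated by central differences rather than computed exactly.

```python
import math

def gradient(f, m, h=1e-5):
    """Central-difference approximation of the row df(m) of equation (4.10)."""
    row = []
    for i in range(len(m)):
        up, dn = list(m), list(m)
        up[i] += h
        dn[i] -= h
        row.append((f(up) - f(dn)) / (2.0 * h))
    return row

def lie_derivative(X, f):
    """Return the function m -> df(m)(X(m)), as in equation (4.12)."""
    def LXf(m):
        return sum(a * d for a, d in zip(X(m), gradient(f, m)))
    return LXf

def X(m):
    """Second (vector) component of an illustrative vector field on R^2."""
    return (-m[1], m[0] + m[1] ** 2)

def f(m):
    return m[0] ** 2 + m[1]

def g(m):
    return math.sin(m[0]) * m[1]

m = (0.3, -0.7)
L_sum = lie_derivative(X, lambda q: f(q) + g(q))(m)
L_prod = lie_derivative(X, lambda q: f(q) * g(q))(m)
Lf, Lg = lie_derivative(X, f)(m), lie_derivative(X, g)(m)
print(L_sum, Lf + Lg)
print(L_prod, f(m) * Lg + g(m) * Lf)
```

Each printed pair should agree up to the finite-difference error, matching additivity and the product rule $L_X(fg) = fL_X(g) + gL_X(f)$ at the sample point.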

Also note that from what we know of $df(m)$ the following holds:

$$L_{aX + bY}(f) = aL_X(f) + bL_Y(f)$$

so that, when acting on $F(M)$, $L$ is a linear map from the set of all vector fields
to the vector space of derivations.

A linear map from one vector space to another is an isomorphism if it is
one-to-one and onto. The present case is one-to-one if $L_X f = 0$ for all $f$ means that
$X = 0$. To show that involves showing that every element in the dual space of $T_m(M)$
is represented by some $df(m)$. The existence proof is a "partitions-of-unity"
construction which can be found in reference 3. Once the dual space is known to be
represented by some $df(m)$, the proof that $L$ is one-to-one is simple. For,
consider $L_X = 0$. Then $L_X(f)(m) = 0$ for all $f$ in $F(M)$. Then

$$df(m)(X(m)) = 0$$

holds and every element $\alpha$ in the dual of $T_m(M)$ has the property that $\alpha(X(m)) = 0$.
Thus, $X(m) = 0$, and this is true for each $m$ in $M$.


The mapping is onto if every derivation arises from the Lie derivative of a
vector field. Recall that for $\mathbb{R}^2$ we showed that every derivation $\theta$ has the form

$$\theta(f)(m) = a_1(m)\frac{\partial f}{\partial x}\Big|_m + a_2(m)\frac{\partial f}{\partial y}\Big|_m$$

Consider the vector field $X(m) = (m, a_1(m), a_2(m))$; then:

$$L_X(f)(m) = df(m)(X(m)) = a_1(m)\frac{\partial f}{\partial x}\Big|_m + a_2(m)\frac{\partial f}{\partial y}\Big|_m$$

and thus $L_X = \theta$. In the case, then, that $M = \mathbb{R}^2$, we have proved the theorem:

The linear space of vector fields on $M$ is isomorphic to the
linear space of derivations of $F(M)$.

The general case is similar, essentially showing that on $M$ the theorem can be
proved locally using the proof given here.

We have now defined vector fields and have shown that they correspond to systems 
of differential equations on manifolds. We have also seen that there is a one-to-one 
correspondence between the space of all vector fields on M and the space of all 
derivations on F(M). This identification endows vector fields with the derivative 
properties one would expect them to have. 


The Algebra of Vector Fields and Lie Derivatives 
The composition of two derivations is not generally a derivation, for:

    \theta_2\theta_1(fg) = \theta_2[f\theta_1 g + g\theta_1 f]

                         = \theta_2 f\,\theta_1 g + \theta_2 g\,\theta_1 f
                           + f\theta_2\theta_1 g + g\theta_2\theta_1 f          (4.13)

The first two terms spoil the derivation property. On the other hand, note:

    \theta_1\theta_2(fg) = \theta_1 f\,\theta_2 g + \theta_1 g\,\theta_2 f
                           + f\theta_1\theta_2 g + g\theta_1\theta_2 f          (4.14)

Subtracting equation (4.14) from equation (4.13) gives:

    (\theta_2\theta_1 - \theta_1\theta_2)(fg) = f(\theta_2\theta_1 - \theta_1\theta_2)g
                                                + g(\theta_2\theta_1 - \theta_1\theta_2)f

so that the operation θ_2 θ_1 - θ_1 θ_2 is a derivation. This difference operation is
called the commutator, or bracket, of θ_2 and θ_1, and is represented thus:

    \theta_2\theta_1 - \theta_1\theta_2 = [\theta_2, \theta_1]

Using the Lie derivative form, as an example to show that the bracket of Lie
derivatives is a Lie derivative, consider a two-dimensional case with


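    L_X f = a^1(x) \frac{\partial f}{\partial x^1} + a^2(x) \frac{\partial f}{\partial x^2}
          = \sum_i a^i(x) \frac{\partial f}{\partial x^i}

and

    L_Y f = b^1(x) \frac{\partial f}{\partial x^1} + b^2(x) \frac{\partial f}{\partial x^2}

Then

    (L_X L_Y) f = \left(a^1 \frac{\partial}{\partial x^1} + a^2 \frac{\partial}{\partial x^2}\right)
                  \left(b^1 \frac{\partial f}{\partial x^1} + b^2 \frac{\partial f}{\partial x^2}\right)

                = \left(a^1 \frac{\partial b^1}{\partial x^1} + a^2 \frac{\partial b^1}{\partial x^2}\right) \frac{\partial f}{\partial x^1}
                  + \left(a^1 \frac{\partial b^2}{\partial x^1} + a^2 \frac{\partial b^2}{\partial x^2}\right) \frac{\partial f}{\partial x^2}

                  + b^1 \left(a^1 \frac{\partial^2 f}{\partial x^1 \partial x^1} + a^2 \frac{\partial^2 f}{\partial x^2 \partial x^1}\right)
                  + b^2 \left(a^1 \frac{\partial^2 f}{\partial x^1 \partial x^2} + a^2 \frac{\partial^2 f}{\partial x^2 \partial x^2}\right)

and

    (L_Y L_X) f = \left(b^1 \frac{\partial a^1}{\partial x^1} + b^2 \frac{\partial a^1}{\partial x^2}\right) \frac{\partial f}{\partial x^1}
                  + \left(b^1 \frac{\partial a^2}{\partial x^1} + b^2 \frac{\partial a^2}{\partial x^2}\right) \frac{\partial f}{\partial x^2}

                  + a^1 \left(b^1 \frac{\partial^2 f}{\partial x^1 \partial x^1} + b^2 \frac{\partial^2 f}{\partial x^2 \partial x^1}\right)
                  + a^2 \left(b^1 \frac{\partial^2 f}{\partial x^1 \partial x^2} + b^2 \frac{\partial^2 f}{\partial x^2 \partial x^2}\right)     (4.15)

Then

    [L_X, L_Y] f = \left(a^1 \frac{\partial b^1}{\partial x^1} + a^2 \frac{\partial b^1}{\partial x^2}
                         - b^1 \frac{\partial a^1}{\partial x^1} - b^2 \frac{\partial a^1}{\partial x^2}\right) \frac{\partial f}{\partial x^1}

                   + \left(a^1 \frac{\partial b^2}{\partial x^1} + a^2 \frac{\partial b^2}{\partial x^2}
                           - b^1 \frac{\partial a^2}{\partial x^1} - b^2 \frac{\partial a^2}{\partial x^2}\right) \frac{\partial f}{\partial x^2}

                 = \sum_i \left(\sum_j \left(a^j \frac{\partial b^i}{\partial x^j} - b^j \frac{\partial a^i}{\partial x^j}\right)\right) \frac{\partial f}{\partial x^i}

                 = L_Z f                                                        (4.16)

Bracketing removes the terms with higher derivatives of f, leaving an expression with
the proper form for a Lie derivative.

The isomorphism which the Lie derivative gives between vector fields and deriva-
tions means that operations involving derivations have corresponding operations
involving vector fields. The operation on vector fields corresponding to the bracket-
ing operation on derivations is also called a bracket and is written similarly. If
X and Y are the vector fields corresponding to L_X and L_Y, then there is a vector
field Z corresponding to [L_X, L_Y] such that Z = [X,Y]. The following theorem holds: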


     The bracket operation on the linear space of vector fields forms
     a Lie algebra with the following properties:

     (i)   [X,Y] = -[Y,X]

     (ii)  [X,Y + aZ] = [X,Y] + a[X,Z],  a ∈ ℝ                      (4.17)

     (iii) [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0

A Lie algebra is, in fact, nothing more than a set of elements that forms a vector
space and for which a multiplication is defined by a bracketing operation with the
three properties in the theorem.
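
The three properties can be tested on a concrete Lie algebra. The sketch below uses square matrices with the commutator PQ - QP as the bracket, a standard example that stands in here for vector fields; the matrices X, Y, Z and the scalar a are arbitrary illustrative values.

```python
# Sketch: verify antisymmetry, linearity, and the Jacobi identity for the
# matrix commutator. Integer entries keep the arithmetic exact.

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(P, Q, scale=1):
    """Return P + scale * Q."""
    n = len(P)
    return [[P[i][j] + scale * Q[i][j] for j in range(n)] for i in range(n)]

def bracket(P, Q):
    """Commutator [P, Q] = PQ - QP."""
    return add(matmul(P, Q), matmul(Q, P), scale=-1)

def max_abs(P):
    return max(abs(v) for row in P for v in row)

X = [[0, 1], [2, 3]]
Y = [[1, -1], [0, 2]]
Z = [[2, 0], [1, 1]]
a = 3

# (i) [X,Y] + [Y,X] = 0
antisymmetry_defect = max_abs(add(bracket(X, Y), bracket(Y, X)))

# (ii) [X, Y + aZ] - [X,Y] - a[X,Z] = 0
lhs = bracket(X, add(Y, Z, scale=a))
rhs = add(bracket(X, Y), bracket(X, Z), scale=a)
linearity_defect = max_abs(add(lhs, rhs, scale=-1))

# (iii) [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0
jacobi = add(add(bracket(X, bracket(Y, Z)), bracket(Y, bracket(Z, X))),
             bracket(Z, bracket(X, Y)))
jacobi_defect = max_abs(jacobi)

print(antisymmetry_defect, linearity_defect, jacobi_defect)
```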


An Example of a Lie Algebra 

The state space representation of linear constant-coefficient control systems is 
a good source of an example of a Lie algebra. From the control system 

    \dot{x} = Ax + Bu

we obtain a family of vector fields, one for each constant u. Let Z_u be the vector
field that acts on a point x by the following:

    Z_u(x) = (x, (Ax + Bu)) \in T_x(M) \subset \mathbb{R}^n \times \mathbb{R}^n

Suppressing the base point x, let Z_u = Ax + Bu. The example of a Lie algebra will
involve calculating the smallest Lie algebra that contains all the elements Z_u in
T_x(M). Define Z_u = (X + U) with U = Bu and X = Ax = Z_0. Also from equa-
tions (4.17),

    [Z_u, Z_v] = [X + U, X + V]

               = -[X,U] + [X,V] + [U,V]

where the fact that [X,X] = 0 for any X has been used. Thus, the Lie algebra
generated by the Z_u is given by the sets {X}, which is a singleton set, and
{U: U = Bu}. These brackets can be evaluated by calculating the brackets of the
corresponding Lie derivatives. The derivatives are written like that in
equation (4.16):

    L_X(f)(x) = \sum_i (Ax)^i \frac{\partial f}{\partial x^i}

and

    L_U(f)(x) = \sum_i (Bu)^i \frac{\partial f}{\partial x^i}

Calculating the bracket [L_U, L_V]f = L_U L_V f - L_V L_U f, one sees that all the second
partials cancel, as shown in equation (4.15). Furthermore, since the coefficients
(Bu)^i do not involve x, their derivatives vanish. Hence, [L_U, L_V] = 0. The next
bracket to consider is [L_X, L_U]f, the first term of which is L_X L_U:




    L_X L_U f = \sum_i \sum_j (Ax)^j \frac{\partial (Bu)^i}{\partial x^j} \frac{\partial f}{\partial x^i} + S

              = S                                                               (4.18)

where S represents the second-order partial derivatives, the other terms vanishing.
The other term of the bracket is a little more complex (take b^i_j as the elements of
B and a^i_j as the elements of A):

    L_U L_X f = \sum_i \sum_j (Bu)^j \frac{\partial (Ax)^i}{\partial x^j} \frac{\partial f}{\partial x^i} + S

              = \sum_i \sum_j \sum_k \sum_m b^j_k u^k \frac{\partial (a^i_m x^m)}{\partial x^j} \frac{\partial f}{\partial x^i} + S

              = \sum_i \sum_j \sum_k a^i_j b^j_k u^k \frac{\partial f}{\partial x^i} + S

              = \sum_i (ABu)^i \frac{\partial f}{\partial x^i} + S               (4.19)

Subtracting equation (4.18) from the last line above gives:

    [L_U, L_X]f = -[L_X, L_U]f = \sum_i (ABu)^i \frac{\partial f}{\partial x^i}

The vector field corresponding to L_{U^{(1)}} will be denoted by:

    U^{(1)} = ABu

The notation anticipates defining U^{(k)} = A^k Bu, where A^k is the kth power of A.
To be consistent, let U^{(0)} replace U. It can be seen by repeating the procedure in
equations (4.19) that [L_{U^{(k)}}, L_X] = L_{U^{(k+1)}}. Hence, one has that

    [U^{(k)}, X] = U^{(k+1)}

The Cayley-Hamilton theorem says that the nth power of an n x n matrix A is a
linear combination of the lower powers of the matrix:

    A^n = \sum_{i=0}^{n-1} \alpha_i A^i

By the Cayley-Hamilton theorem, then, the process of bracketing terminates with

    U^{(n)} = \sum_{i=0}^{n-1} \alpha_i U^{(i)}

The Lie algebra has the following multiplication table:

    [X, U^{(k)}] = -U^{(k+1)}
    [U^{(k)}, U^{(j)}] = 0

and every vector field Z can be written as a linear combination of the vector fields
X, U^{(0)}, . . ., U^{(n-1)}.

That every vector field can be written in these terms is related to the notion
of controllability of control systems. In fact, as was recalled in section 2, a well-
known criterion for the complete controllability of the linear constant coefficient
system \dot{x} = Ax + Bu is that the matrix whose columns are obtained from the columns
of A^k B, k = 0, . . ., n - 1, where A is n x n, have rank n:

    rank (B, AB, . . ., A^{n-1}B) = n

This is one example showing that the theory of the controllability of systems is 
related to the dimension of the Lie algebra generated by the families of vector fields. 
The literature is rich in regard to the connection with nonlinear systems (see, e.g., 
ref. 10). 
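
The rank criterion is easy to evaluate for small systems. The following sketch builds the matrix (B, AB, . . ., A^{n-1}B), whose blocks correspond to the iterated brackets U^{(k)} = A^k Bu, and computes its rank by exact Gaussian elimination. The two example systems (a double integrator and a system with A equal to the identity) are illustrative assumptions, not taken from the text.

```python
# Sketch: controllability rank test using exact rational arithmetic.
from fractions import Fraction

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def controllability_matrix(A, B):
    """Return the n x (n*m) matrix (B, AB, ..., A^(n-1) B)."""
    n = len(A)
    blocks = []
    current = B
    for _ in range(n):
        blocks.append(current)
        current = matmul(A, current)
    # Concatenate the blocks side by side, row by row.
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

def rank(M):
    """Rank by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# Hypothetical example 1: double integrator, expected to be controllable.
A1 = [[0, 1], [0, 0]]
B1 = [[0], [1]]
# Hypothetical example 2: A = I, so AB = B and the brackets add nothing new.
A2 = [[1, 0], [0, 1]]
B2 = [[1], [1]]

rank1 = rank(controllability_matrix(A1, B1))
rank2 = rank(controllability_matrix(A2, B2))
print(rank1, rank2)
```

In the second system the bracketing process produces no new directions, so the rank stays below n and the system is not completely controllable.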

The Lie algebra of all vector fields on a manifold seems to be a very difficult 
object to study. There are many mathematical questions involving this algebra which 
will probably not be answered in the near future. However, in the next section we 
show that when the manifold is a Lie group there is a subalgebra which is intimately 
related to the Lie group structure.


LIE GROUPS, GROUP ACTIONS AND LIE ALGEBRAS


The connection between Lie algebras and Lie groups will be seen in this section 
to be the same as the relationship between a linear differential equation and its 
solution. The connection will be discussed after what is meant by a group, by a Lie 
group, and by the action of a group have been clarified. 


Lie Groups 


Since the idea of a group is a purely abstract algebraic idea, the definition of
a group should involve only a set of elements and some algebraic relations between
them. A group, then, is a set of elements, like a set of matrices, any pair of which
can operate together to give another element that is also in the same set: the
product of two n x n matrices is an n x n matrix. Matrix multiplication is the
operation for this example. When the integers are viewed as forming a group, addi-
tion is the group operation. Besides having an operation between elements that yields
another element of the set, a group requires the following conditions. To each ele-
ment of the group there corresponds an inverse that is also in the group. One element
in the group acts as an identity element: unity, in the case of multiplication; zero,
in the case of addition. Finally, the operation is associative.


Among matrices, for example, consider the unitary matrices discussed in section 2
(unitary matrices whose elements are real are orthogonal matrices). The set of all
n x n unitary matrices, U(n), forms a group with the usual matrix product as the
operation: if A and B belong to U(n), so does AB. Since for every A there is
an A*, its conjugate transpose (the ordinary transpose when A is real), such that
A*A = AA* = e, A* is the inverse of A. Also, the matrix e (or I) is the identity
element belonging to U(n), and AI = A for every A ∈ U(n).


The discussion in section 2 showed that the group U(n) is a manifold, and that
its dimension, for real elements, is (1/2)n(n - 1). The product of two unitary
matrices is a unitary matrix, and section 2 showed that this operation was a manifold
map. Furthermore, since the elements of the product matrix consist of sums of products
of the elements of the two factor matrices, which are real or complex numbers, they
are C^∞ functions (even analytic functions, that is, expressible as a Taylor's
series). The above properties characterize what are called Lie groups (or "continuous
transformation groups," in the older literature). Formally, a Lie group is a group
that is a manifold and whose group operation yields manifold functions and is
associative.

Although Lie group elements need not be written as matrices, they will be thought of 
in that way in the sequel. 

To illustrate the notation, which follows the usual group notation, let g, h,
and k be elements of the group. Denoting the group operation by "·" gives
·( , ): G × G → G; thus, ·(g,h) = gh. Associativity gives that
·(k,·(g,h)) = ·(·(k,g),h). For the inverse, ( )^{-1}: G → G; thus (g)^{-1} = g^{-1}. The
·( , ) and ( )^{-1} are C^∞ functions. The identity element, e, is a distinguished ele-
ment of the group such that eg = ge = g for any and all g ∈ G.


Group Action 

The elements of a group are not restricted to operating among themselves. Their
importance, in fact, comes from their operating on objects that don't belong to the
same set. Thus, the importance of matrices is not that they may belong to a group,
but that they operate on vectors, say, to form new vectors in old frames or old
vectors in new frames. The action of a group refers to members of a group operating
on nonmembers.


Let M be a manifold and G be a Lie group. Suppose that there is a C^∞
function τ: G × M → M (with τ(g,m_1) = m_2, where m_1 and m_2 both belong to M) that
has the following properties:

     (i)  for all m ∈ M, τ(e,m) = m

     (ii) τ(g,τ(h,m)) = τ(gh,m)

We say in this case that the group G acts on M. Any group of n x n matrices, say
the general n-dimensional linear group over the reals, Gl(n,ℝ), then, in operating on
vectors in Euclidean space by the usual matrix multiplication, is said to act on ℝ^n.


Another example, with a somewhat more involved action, and which is not linear,
can be described in terms of linear fractional transformations. Let ℝ^2 be identi-
fied with ℂ. Let Gl(2,ℂ) be the group of invertible 2 x 2 matrices with complex
elements. Define a map τ: Gl(2,ℂ) × ℂ → ℂ as

    \tau\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}, z\right) = \frac{az + b}{cz + d} \in \mathbb{C}

The mapping respects the group product:

    \tau\left(\begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix},
         \tau\left(\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix}, z\right)\right)
      = \frac{(a_2 a_1 + b_2 c_1)z + a_2 b_1 + b_2 d_1}{(c_2 a_1 + d_2 c_1)z + c_2 b_1 + d_2 d_1}

The group Gl(2,ℂ) is said to act on ℂ by the map τ. Now, the action is not well
defined for z = -d/c. Hence, the particular maps must be considered to describe the
action in local coordinates.
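
The two defining properties of a group action can be checked for linear fractional transformations with a few lines of code. This is a minimal sketch: the function names and the sample matrices g1 and g2 are illustrative assumptions, and the sample point z is chosen away from the excluded value -d/c.

```python
# Sketch: the linear fractional map tau(g, z) = (a z + b) / (c z + d)
# satisfies tau(e, z) = z and tau(g2, tau(g1, z)) = tau(g2 g1, z).

def mobius(g, z):
    """Apply the 2 x 2 complex matrix g = ((a, b), (c, d)) to the point z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def matmul2(p, q):
    """Product of two 2 x 2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

identity = ((1, 0), (0, 1))
g1 = ((1 + 2j, 0.5), (0.25j, 1))   # hypothetical group elements
g2 = ((2, -1j), (1, 3))
z = 0.3 - 0.7j

identity_gap = abs(mobius(identity, z) - z)
composition_gap = abs(mobius(g2, mobius(g1, z)) - mobius(matmul2(g2, g1), z))
print(identity_gap, composition_gap)
```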


One-Parameter Subgroups and Vector Fields 


Consider a subgroup α of a Lie group, in which the elements of the subgroup
are given in terms of a single parameter. One element is given by α(t). A neighbor-
ing element is given by α(t + h) if t + h is close to t. One-parameter subgroups
of a Lie group are distinguished by the properties:

    \alpha(t)\alpha(h) = \alpha(t + h)
                                                                    (5.1)
    \alpha(0) = e, the identity

One corollary of properties (5.1) is that α^{-1}(t) = α(-t). Another corollary can be
related to noting that the basic group property expressed in (5.1) makes a product of
elements at different values of the parameter correspond to an element at the sum of
the parameter values. A correspondence like that is typical of exponential functions.
How exponentials get involved from these properties is seen from considering
derivatives:

    \frac{d\alpha(t)}{dt} = \lim_{h \to 0} \frac{1}{h}[\alpha(t + h) - \alpha(t)]

                          = \lim_{h \to 0} \frac{1}{h}[(\alpha(h) - e)\alpha(t)]      (5.2)

    \dot{\alpha}(t) = A\alpha(t)

where A is the limit as h → 0 of (α(h) - e)/h. The limit exists because the group
is a manifold whose coordinates are C^∞ functions. Equation (5.2) makes it easy to
see why A is said to generate the subgroup, or is referred to as the infinitesimal
generator of the subgroup.


If α and A are real numbers, equation (5.2) has the solution
α(t) = e^{At}α(0) = e^{At}, confirming the connection of properties (5.1) with exponentials.
The same form holds if A is a matrix, generating a matrix representation of the sub-
group. In that case, the exponential function is understood to mean the series:

    exp(At) = I + At + A^2 t^2/2! + . . .

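These points can be illustrated by an example of a subgroup taken from the uni-
tary group U(2):

    \alpha(t) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}      (5.3)

To illustrate (5.1):

    \alpha(t)\alpha(h) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}
                         \begin{pmatrix} \cos h & \sin h \\ -\sin h & \cos h \end{pmatrix}

                       = \begin{pmatrix} \cos t \cos h - \sin t \sin h & \sin t \cos h + \cos t \sin h \\
                                         -(\sin t \cos h + \cos t \sin h) & \cos t \cos h - \sin t \sin h \end{pmatrix}

                       = \begin{pmatrix} \cos(t + h) & \sin(t + h) \\ -\sin(t + h) & \cos(t + h) \end{pmatrix}

                       = \alpha(t + h)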

The derivative can be calculated directly or through examining the limit shown above,
and is found to be:

    \dot{\alpha}(t) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
                      \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}

which gives for a representation of A the matrix:

    A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}                                  (5.4)

The solution of (5.4) should give (5.3), of course. To see that "exponentiating" the
generator, A → exp(At), gives (5.3), note that

    A^2 = -I,    A^3 = -A

Then

    exp(At) = I + At + A^2 t^2/2! + A^3 t^3/3! + . . .

            = I (1 - t^2/2! + t^4/4! - . . .) + A (t - t^3/3! + . . .)

            = I \cos t + A \sin t

            = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}

as was expected.
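
The series can also be summed numerically. The sketch below, which assumes a truncation after 40 terms is adequate for the small arguments used, sums exp(At) for the generator A with rows (0, 1) and (-1, 0), compares the result with the rotation matrix of (5.3), and checks the subgroup property of (5.1). The helper names are illustrative.

```python
# Sketch: truncated exponential series for a one-parameter subgroup.
import math

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm_series(A, t, terms=40):
    """Partial sum I + At + (At)^2/2! + ... with the given number of terms."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes A^k t^k / k!
        term = [[v * t / k for v in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def rotation(t):
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

def max_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(len(P)) for j in range(len(P)))

A = [[0.0, 1.0], [-1.0, 0.0]]
t, h = 0.8, 0.5

series_gap = max_diff(expm_series(A, t), rotation(t))
group_gap = max_diff(matmul(expm_series(A, t), expm_series(A, h)),
                     expm_series(A, t + h))
print(series_gap, group_gap)
```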

Now, α(t) is a curve on the U(2) manifold. To connect with earlier sections,
note that it represents a homeomorphism of an interval of the real line ℝ^1 to U(2).
It is a manifold map ℝ^1 → U(2). It is a coordinate function of the manifold.
Indeed, since the dimension of U(2) with real elements is one, α(t) is the only
coordinate function, or one realization of it. Furthermore, α(t) belongs to an
equivalence class [α(t)]_e.

The tangent space at t = 0 is given explicitly as I + At, which is the tangent
vector there. The vector field, X_A, is (I, A), and in the earlier notation, X_A = A.
The element of the tangent space is uniquely determined by A. This is seen by
considering: [α]_e = [β]_e if and only if \dot{α}(0) = \dot{β}(0) if and only if
Aα(0) = Bβ(0) if and only if A = B.


It is thus clear that every point of a one-parameter Lie group satisfies a linear 
first-order differential equation. It thus gives rise to a vector field which is an 
infinitesimal generator for the subgroup. The converse is also true, that every 
vector field acts as an infinitesimal generator of a one-parameter Lie group. When 
expressed as a matrix, the vector field can be exponentiated for the Lie group, the 
exponential form meaning the series expansion. The interpretation in the tangent 
space is simple: X_A α(0) = (I + At)α(0).

The previous section showed that the set of vector fields is not only a vector space,
but also an algebra. If X_A = A, then X_{pA} = pA, for p a real number; X_{A+B} = A + B;
and similarly for X_{[A,B]}, the commutator of A and B, namely, AB - BA. The product
AB, however, does not generally belong to a vector field.


As pA, A + B, and (AB - BA) are vector fields, they should be generators of Lie
groups. For the first,

    \dot{\gamma}(t) = pA\gamma(t)  \Rightarrow  \gamma(t) = exp(pAt)\gamma(0) = \alpha(pt)

clearly a one-parameter group. For the second,

    \dot{\gamma}(t) = (A + B)\gamma(t)

gives

    \gamma(t) = exp((A + B)t)\gamma(0)

              = I + (A + B)t + (A^2 + AB + BA + B^2)t^2/2! + . . .

Now, exp(At)·exp(Bt) = I + (A + B)t + (A^2 + 2AB + B^2)t^2/2! + . . . = α(t)β(t), which
agrees with the series for γ(t) through the terms of first order in t. For small
enough t, then, γ(t) = α(t)β(t), a one-parameter group. Conversely,
\dot{γ}(t) = \dot{α}(t)β(t) + α(t)\dot{β}(t) in the neighborhood of t = 0, so that

    \dot{\gamma}(t) = (A + B)\gamma(t)

Here, the correspondence between multiplication in the group and addition in the
algebra is visible.

To see that the commutator of matrices belonging to two vector fields also gener-
ates a Lie group is more difficult. Care in examining the limiting process in taking
the derivative of the group element α(t)β(t)α^{-1}(t)β^{-1}(t) gives the commutator as its
infinitesimal generator, however.
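
A small numerical experiment makes the limiting process concrete. In the sketch below the generators A and B are hypothetical nilpotent matrices chosen so that their exponentials are known in closed form; the group element α(t)β(t)α^{-1}(t)β^{-1}(t) then differs from the identity by t^2(AB - BA) plus higher-order terms, so the quotient by t^2 (a scaling assumed here) approaches the commutator as t tends to zero.

```python
# Sketch: the group commutator of two one-parameter subgroups recovers AB - BA.

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def alpha(t):
    """exp(At) for the hypothetical generator A = [[0, 1], [0, 0]]."""
    return [[1.0, t], [0.0, 1.0]]

def beta(t):
    """exp(Bt) for the hypothetical generator B = [[0, 0], [1, 0]]."""
    return [[1.0, 0.0], [t, 1.0]]

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
AB, BA = matmul(A, B), matmul(B, A)
commutator = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

t = 1e-3
# alpha(-t) and beta(-t) are the inverses of alpha(t) and beta(t).
g = matmul(matmul(alpha(t), beta(t)), matmul(alpha(-t), beta(-t)))
estimate = [[(g[i][j] - (1.0 if i == j else 0.0)) / t ** 2 for j in range(2)]
            for i in range(2)]

commutator_gap = max(abs(estimate[i][j] - commutator[i][j])
                     for i in range(2) for j in range(2))
print(estimate, commutator_gap)
```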

It is easy to verify these relations by examples, and they will be examined that
way shortly. But first, it should be remarked that if what has been said about one-
parameter groups were true only for them, then the information would be of little use.
As a matter of fact, appreciation of the group properties derives from Sophus Lie's
study of differential equations. The connection with useful situations is made
through the action of the group on vector spaces. For example, consider a curve in
ℝ^n that is given by the action of a one-parameter group: x(t) = α(t)x(0). Now
\dot{x}(t) = \dot{α}(t)x(0) = Aα(t)x(0) = Ax(t). While Ax is a vector field on ℝ^n, it still is
also the infinitesimal generator of a Lie group α(t), the determination of which
corresponds to finding the solution of the differential equation. The mental picture
given is that of the curve x(t) being traced in ℝ^n by the evolution of the contin-
uous transformation of the initial condition under the action of the group.


Examples from the Symplectic Group 


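Let X_A and X_B be elements of a set of vector fields L(G) induced by the one-
parameter subgroups α(t) and β(t). That X_{pA}, X_{A+B}, and X_{[A,B]} also belong to L(G)
will be illustrated by examples from Sp(2), the group of 2 x 2 matrices representing
the symplectic group. A matrix M belongs to Sp(2) if:

    M \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} M^T = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}      (5.5)

For examples, let:

    \alpha(t) = \begin{pmatrix} \cosh t + \sinh t & 0 \\ \sinh t & \cosh t - \sinh t \end{pmatrix},
    \qquad
    \beta(t) = \begin{pmatrix} 1 + t & t \\ -t & 1 - t \end{pmatrix}

They both can be shown to belong to the symplectic group. Their infinitesimal
generators are:

    A = \begin{pmatrix} 1 & 0 \\ 1 & -1 \end{pmatrix},
    \qquad
    B = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}

Now,

    pA = p \begin{pmatrix} 1 & 0 \\ 1 & -1 \end{pmatrix},
    \qquad
    A^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I

Hence, exponentiating pA gives:

    exp(pAt) = I (1 + p^2 t^2/2! + . . .) + A (pt + p^3 t^3/3! + . . .)

             = I \cosh pt + A \sinh pt = \alpha(pt)

For X_{A+B},

    A + B = \begin{pmatrix} 2 & 1 \\ 0 & -2 \end{pmatrix}

    exp((A + B)t) = \begin{pmatrix} \sum_n \frac{(2t)^n}{n!} & \frac{1}{2} \sum_n \frac{(2t)^{2n+1}}{(2n + 1)!} \\
                                    0 & \sum_n \frac{(-2t)^n}{n!} \end{pmatrix}

                  = \begin{pmatrix} e^{2t} & \frac{1}{2} \sinh 2t \\ 0 & e^{-2t} \end{pmatrix}

whose symplectic property is easily confirmed. It also agrees with α(t)β(t) to
terms of first order in t. Finally,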


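    AB - BA = \begin{pmatrix} -1 & 2 \\ 4 & 1 \end{pmatrix}

To calculate exp(AB - BA)t, note that the eigenvalues of the commutator are ±3,
that the eigenvectors are (1,2)^T and (1,-1)^T. Hence:

    AB - BA = \frac{1}{3} \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}
              \begin{pmatrix} 3 & 0 \\ 0 & -3 \end{pmatrix}
              \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}

and

    (AB - BA)^n = \frac{1}{3} \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}
                  \begin{pmatrix} 3^n & 0 \\ 0 & (-3)^n \end{pmatrix}
                  \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}

and hence

    exp(AB - BA)t = \frac{1}{3} \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}
                    \begin{pmatrix} e^{3t} & 0 \\ 0 & e^{-3t} \end{pmatrix}
                    \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}

                  = \frac{1}{3} \begin{pmatrix} e^{3t} + 2e^{-3t} & e^{3t} - e^{-3t} \\
                                                2e^{3t} - 2e^{-3t} & 2e^{3t} + e^{-3t} \end{pmatrix}

                  = \begin{pmatrix} \cosh 3t - \frac{1}{3} \sinh 3t & \frac{2}{3} \sinh 3t \\
                                    \frac{4}{3} \sinh 3t & \cosh 3t + \frac{1}{3} \sinh 3t \end{pmatrix}

Again, that it is symplectic and forms a one-parameter subgroup is easily verified.
On the other hand, from

    AB = \begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix}

one gets

    exp(ABt) = I + \frac{1}{3}(e^{3t} - 1) AB

This proves not to be a one-parameter subgroup of Sp(2) (its determinant is wrong).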


If equation (5.5) is generalized from 2 x 2 matrices, the 2n x 2n matrix M
belongs to Sp(2n,ℝ) if its elements are real numbers, M^{-1} exists, and
M J M^T = J for

    J = \begin{pmatrix} 0 & I_{n \times n} \\ -I_{n \times n} & 0 \end{pmatrix}

Then equation (5.5) can be written as:

    J M^T = M^{-1} J                                                            (5.6)

Differentiating this form of the equation (and noting that M \frac{d}{dt}(M^{-1}) = -\dot{M} M^{-1}
holds because d/dt(M M^{-1}) = 0 holds) gives:

    J \dot{M}^T = -M^{-1} \dot{M} M^{-1} J

Then M J \dot{M}^T = -\dot{M} M^{-1} J = J M^{-T} \dot{M}^T holds, by equation (5.6). This can be
written as:

    S J = -J S^T                                                                (5.7)

where S is given by S = \dot{M} M^{-1}, or

    \dot{M} = S M                                                               (5.8)

Equation (5.8) shows that S is the generator of the one-parameter symplectic group
M(t). Equation (5.7) says that candidates for the matrix S must have the form

    S = \begin{pmatrix} \mathcal{A} & \mathcal{B} \\ \mathcal{C} & -\mathcal{A}^T \end{pmatrix}

that is, the off-diagonal blocks \mathcal{B} and \mathcal{C} are symmetric matrices and the
diagonal blocks are related as shown. Note that for the examples discussed earlier, with
α and β symplectic, the matrices A, B, pA, A + B, and AB - BA all have the proper
S form, but AB does not.
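
Both conditions are simple to test in the 2 x 2 case. The sketch below is an illustration under stated assumptions: it takes two trace-free matrices A and B as sample generators, checks the generator condition SJ = -JS^T for A, B, A + B, and AB - BA, shows that the product AB fails it, and checks MJM^T = J for the group elements cosh(t) I + sinh(t) A and I + tB, which are exp(At) and exp(Bt) for these particular matrices.

```python
# Sketch: generator test (5.7) and group test (5.5) for 2 x 2 matrices.
import math

J = [[0, 1], [-1, 0]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(P):
    return [[P[j][i] for j in range(2)] for i in range(2)]

def add(P, Q, scale=1):
    return [[P[i][j] + scale * Q[i][j] for j in range(2)] for i in range(2)]

def max_abs(P):
    return max(abs(v) for row in P for v in row)

def generator_defect(S):
    """Largest entry of S J + J S^T; zero when S satisfies (5.7)."""
    return max_abs(add(matmul(S, J), matmul(J, transpose(S))))

def group_defect(M):
    """Largest entry of M J M^T - J; zero when M is symplectic."""
    return max_abs(add(matmul(matmul(M, J), transpose(M)), J, scale=-1))

A = [[1, 0], [1, -1]]      # sample generators (assumed for this sketch)
B = [[1, 1], [-1, -1]]
AB = matmul(A, B)
commutator = add(AB, matmul(B, A), scale=-1)

t = 0.4
c, s = math.cosh(t), math.sinh(t)
alpha = [[c + s, 0.0], [s, c - s]]       # cosh(t) I + sinh(t) A, since A^2 = I
beta = [[1 + t, t], [-t, 1 - t]]         # I + t B, since B^2 = 0

generator_defects = [generator_defect(S) for S in (A, B, add(A, B), commutator)]
product_defect = generator_defect(AB)
group_defects = [group_defect(alpha), group_defect(beta)]
print(generator_defects, product_defect, group_defects)
```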


There is an extensive theory of the relationship between Lie groups and their 
Lie algebras of one-parameter subgroups. This is one of those beautiful areas of 
mathematics where we have a category of objects (Lie groups) and a functor into 
another category (Lie algebras) and a great deal is known about each category. The 
advantage is that sometimes problems can be posed in one category, solved easily in 
the other, and the answer interpreted in the first. There are many such examples in 
mathematics but it has only been recently that such phenomena have been explicitly 
noted in control theory. Now, though, many such examples are evident in control 
theory. 


GRASSMANNIAN MANIFOLDS AND THE RICCATI EQUATION 


In this section, we develop in some detail a class of nontrivial manifolds, the
Grassmannian manifold G^p(V). The V in G^p(V) is a finite dimensional vector space
of dimension n. The manifold is the set of all p-dimensional subspaces of this
n-dimensional space. We called it nontrivial for two reasons. One is that showing it
to be a manifold is a nontrivial task that forms the heart of this section. The
other is that these manifolds are important to control systems. That they provide a
foundation to many important ideas in control systems theory is only now being
realized. This section should let the patient reader understand something of the
manifold and believe, through seeing it applied, that it has some use.

We have seen that a C^∞ manifold is a pair (M,A) where M is a second countable
Hausdorff space and A is an atlas of charts (of open sets and their maps). That
G^p(V) satisfies the definition of a manifold will be shown in this order. First, a
subspace W_A is defined, its local coordinates and some properties described. Then
Γ(W) is defined as a set of W_A's. It is seen to be an open set with local maps to
Euclidean space. These maps are shown to be properly diffeomorphic. Several Γ(W)'s
and their maps thus form a suitable atlas for G^p(V). Finally, that G^p(V) is a
manifold is seen by establishing that it is Hausdorff and compact, not just second
countable.

The following subsection shows how Riccati equations arise naturally in connec-
tion with transformations on the Grassmannian. Several key properties of the equa-
tions are established through the recognition that their source is their action on
these manifolds.





Before studying the Grassmannian manifold, we will briefly discuss some aspects 
of linear optimal control. This apparent digression should help motivate the later 
discussion by giving some evidence that it is connected with the earlier considera- 
tion of the Lie groups and with control. It is assumed that the reader has at least 
an elementary knowledge of linear optimal control. 


Linear Optimal Control 

We discuss only the simplest problem of obtaining the optimal control of a time-
invariant and continuous linear dynamical system. Given the system modelled by the
set of n equations

    \dot{x} = Ax + bu                                                           (6.1)

it is desired to choose the control u = u* so as to turn the functional:

    J = \frac{1}{2} \int (x^T Q x + u^T R u)\,dt = \int \mathcal{L}(x,u)\,dt

into the smallest real number the system allows. The parameters Q and R form sym-
metric matrices and R is invertible.


An approach to solving this problem defines a Hamiltonian function

    \mathcal{H}(u^*) = \min_u [\mathcal{L}(x,u) + p^T(Ax + bu)]                 (6.2)

in which an n-dimensional multiplier, p, has been introduced. The multiplier raises
the dimensionality of the problem to 2n but allows the conditions for \mathcal{H}(u^*) to be
obtained in a consistent way:

    \partial\mathcal{H}/\partial u = 0

    \dot{x}^T = \partial\mathcal{H}/\partial p                                  (6.3)

and

    -\dot{p}^T = \partial\mathcal{H}/\partial x


Applying the recipe of equations (6.3) to (6.2) and transposing terms gives the
equations

    u = -R^{-1} b^T p

    \dot{x} = Ax + bu = Ax - bR^{-1}b^T p                                       (6.4)

    -\dot{p} = Qx + A^T p

The last two equations in (6.4) are called Hamilton's equations. They can be written
in matrix form as:

    \frac{d}{dt} \begin{pmatrix} x \\ p \end{pmatrix}
      = \begin{pmatrix} A & -bR^{-1}b^T \\ -Q & -A^T \end{pmatrix}
        \begin{pmatrix} x \\ p \end{pmatrix}
      = H \begin{pmatrix} x \\ p \end{pmatrix}                                  (6.5)

The matrix H in equation (6.5) is sometimes called the Hamiltonian of optimal con-
trol. It has the right form to be the generator of a one-parameter symplectic group.
The equation is solved for initial conditions on x(t) and final conditions on p(t).


The complications due to solving these coupled equations with mixed boundary con-
ditions can be made someone else's problem by a transformation:

    \begin{pmatrix} x \\ p \end{pmatrix}
      = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}
        \begin{pmatrix} \xi \\ \eta \end{pmatrix}                               (6.6)

The matrices I and P in equation (6.6) are n x n. If P is symmetric, the
character of the Hamiltonian matrix as a symplectic generator will be preserved.


Applying the transformation of equation (6.6) to (6.5) gives:

    \frac{d}{dt} \begin{pmatrix} \xi \\ \eta \end{pmatrix}
      = \begin{pmatrix} A - bR^{-1}b^T P & -bR^{-1}b^T \\
                        -\dot{P} - PA - A^T P + PbR^{-1}b^T P - Q & -(A - bR^{-1}b^T P)^T \end{pmatrix}
        \begin{pmatrix} \xi \\ \eta \end{pmatrix}                               (6.7)

Examining equation (6.7) shows that it leads to the decoupled equation
\dot{\xi} = (A - bR^{-1}b^T P)\xi by choosing P so that η = 0 at least at one time and so that
\dot{\eta} = 0 holds. These choices imply the conditions:

    -\dot{P} = PA + A^T P - PbR^{-1}b^T P + Q                                   (6.8)

and

    p(t_0) = P(t_0)x(t_0)                                                       (6.9)

Equation (6.8) is a matrix Riccati equation. Equation (6.9) shows that the multiplier 
constrains x(t) to a curve on an n-dimensional manifold. Though the manifold of 
interest is the one which supports the trajectory of the controlled dynamical system, 
it is characterized by the selection of p as a complement to x in the 
2n-dimensional space. 
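
For a scalar system the Riccati equation can be integrated directly. The sketch below is an illustration under stated assumptions: it reads the left side of (6.8) as -dP/dt, which is what Hamilton's equations (6.4) give when p = Px is substituted, uses the time to go s = T - t so that dP/ds = 2aP - (b^2/r)P^2 + q, and starts from the assumed terminal value P(T) = 0. The numbers a, b, q, r are arbitrary sample values, and the integrator is an ordinary fourth-order Runge-Kutta step.

```python
# Sketch: scalar Riccati equation integrated backward from P(T) = 0.
import math

def riccati_rhs(P, a, b, q, r):
    """dP/ds = 2aP - (b^2/r) P^2 + q, with s = T - t the time to go."""
    return 2.0 * a * P - (b * b / r) * P * P + q

def integrate_backward(a, b, q, r, horizon=10.0, steps=10000):
    """Classical RK4 integration over the horizon, starting from P = 0."""
    h = horizon / steps
    P = 0.0
    for _ in range(steps):
        k1 = riccati_rhs(P, a, b, q, r)
        k2 = riccati_rhs(P + 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(P + 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(P + h * k3, a, b, q, r)
        P += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return P

a, b, q, r = 1.0, 1.0, 3.0, 1.0        # hypothetical scalar system and weights

P_limit = integrate_backward(a, b, q, r)
# Positive root of the algebraic equation 2aP - (b^2/r) P^2 + q = 0.
P_algebraic = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
closed_loop = a - b * b * P_limit / r   # coefficient of the decoupled equation
print(P_limit, P_algebraic, closed_loop)
```

Over a long horizon the solution settles at the positive root of the algebraic equation, and the coefficient A - bR^{-1}b^T P of the decoupled equation is negative, so the controlled system is stable.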


The procedure just described may seem to be a novel way of selecting the multi- 
plier to solve the optimization problem. In fact it is an illustration of an approach 
to the study of geometry which was introduced by Grassmann in the 1840's. Following
the earlier studies of projective space, he looked into the relations between a curve 
given in one space by a set of parameters and the curve it induced in the parameter 
space. In our example, x(t) is a curve with certain parameters which describe a curve 
in a dual space. The transformation is generated by the Riccati equation. In a 
general Grassmannian, the two spaces need not have the same dimension. 


The Grassmannian 

That the Grassmannian G^p(V) is a manifold will be developed in this subsection.
Now, G^p(V) is the set of all p-dimensional subspaces of an n-dimensional vector
space, V. It is covered by an atlas of charts Γ(W). Each Γ(W) is the set of
subspaces W_A given by the local coordinate map written as the (n - p) x p matrix,
A. First, a simple example will be given to fix ideas of the space V and a sub-
space W_A. Then the nomenclature will be explained in detail.

A simple example.- Take V as the plane ℝ^2, and let U and W be one-dimensional
subspaces of it such that V is their Cartesian product: V = U × W. Now, U and W
are to have no subspaces in common, but since they are to be vector spaces, they do
share the point (0,0) of V in this representation, which will be referred to as the
set {0}. The situation can be visualized as in sketch (h), where one can think of the
plane V with oblique coordinate directions U and W defined:



Sketch (h) 

The point P in the sketch belongs to the subspace W_A. It is specified uniquely by
u + Au, with A a real number (a 1 x 1 matrix) and u ∈ U. Every point in W_A is
specified similarly by some u, and every point in the plane except those of W itself
belongs to a W_A for some A. The collection of these W_A for the given choice of
W is an open set Γ(W). Included in a chart is W for a different subspace W' of
V, where now V = U × W'. Then W is, say, the subspace W'_B and belongs to the open
set Γ(W'). Now W_A and Γ(W) will be described in general.

The subspace W_A.- Let U and W be subspaces of V such that the dimension of
U is p, the dimension of W is n - p, and V is their direct sum. By the direct
sum is meant that U and W have no subspace in common; their intersection contains
no nonzero subspace. Consider the subspaces W_A such that:

    W_A = \{u + Au : u \in U\},   A \in L(U,W)                                   (6.10)

Now note that the dimension of W_A is p; let u_1, . . ., u_p be a basis for U, and
suppose α_1(u_1 + Au_1) + . . . + α_p(u_p + Au_p) = 0. Then

    \sum_i \alpha_i A u_i = 0

and

    \sum_i \alpha_i u_i = 0

since V = U ⊕ W. Thus, all of the α_i = 0 and we conclude that {u_i + Au_i: i ≤ p} is
linearly independent and is a basis for W_A. The space W_A is a p-dimensional sub-
space determined by the linear transformation A.

The decomposition of W_A is unique because U and W have zero intersection;
suppose W_A = W_B, then u_i + Au_i = v_i + Bv_i for some choice of v_i, and we can assume
that the u_i form a basis for U. Then u_i - v_i = Bv_i - Au_i. This implies that
u_i = v_i because u_i - v_i ∈ U, Bv_i - Au_i ∈ W, and U ∩ W = {0}. We conclude that
Au_i = Bu_i for a basis of U; thus A = B.

Not every p-dimensional subspace of V can be represented as a W_A. We can
prove the following fact which will be useful later:

     A p-dimensional subspace U_0 can be represented as W_A for
     some A iff U_0 ∩ W = {0}.

Proof: Let z ∈ U_0 ∩ W; if U_0 = W_A then z = u + Au for some u and so
u = z - Au. Since z ∈ W and Au ∈ W we have that u ∈ W, hence, u is zero, and so
is z. Thus, U_0 = W_A implies that U_0 ∩ W = {0}. On the other hand, suppose
U_0 ∩ W = {0}. Let s_1, . . ., s_p be a basis for U_0 and write each s_i as
s_i = u_i + w_i. This decomposition is unique. Now, the u_i form a basis for U; for,
suppose that

    \sum_i \alpha_i u_i = 0

Then we have that

    \sum_i \alpha_i s_i = \sum_i \alpha_i w_i

which, since the left side lies in U_0 and the right side in W, implies that

    \sum_i \alpha_i s_i = 0

so that α_i = 0 for all i. Thus, {u_i: i ≤ p} is a basis for U, and no subspaces
are left in U_0 to be generated by any of the w_i. Let A be the linear map that
takes u_i to w_i. Then U_0 = W_A.

The charts Γ(W).- Let Γ(W) be the set of all W_A. Thus Γ(W) is indexed by the
set of linear maps, L(U,W), from U into W and we've seen that each element of Γ(W)
is uniquely associated with an element of L(U,W). We can define a map φ from Γ(W) to
ℝ^{(n-p)×p} by

    \phi(W_A) = A

The map φ depends on a choice of basis for the space V. It can be shown that every
p-dimensional subspace of V is in Γ(W) for some choice of W. Then (Γ(W),φ) are
suitable candidates for charts and chart mappings. Their differentiable structure
needs to be verified.

With reference to sketch (i) , let S be an element of F(W) and F(W') and sup- 
pose S = W^. We need to determine a T such that S = Wrp. Each element w of W 

can be written uniquely as a sum of elements from W' and U. As illustrated in 
sketch (j), let w = u^ + w[. Define two functions A 3 ^ and A 2 by A^ (w) - ui and 
^ 2 (w) = w[. Since the decomposition is unique, it follows that A^ E L(W,U) and 

A 2 ^ L(W,W'). It will be useful later to note now that A 2 is invertible. This is 

shown by supposing A 2 (w) = 0. Then w = A^ (w) . But this implies that w E U, which 
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Sketch (j) 

is equivalent to w = 0. Thus, A 2 is one to one, and since W and W' have the same 
dimension, A 2 is invertible. Recall that: 

$$S = W_A = \{u + Au : u \in U\}, \qquad A \in L(U,W) \tag{6.11}$$

From what has just been derived, we can write:

$$\begin{aligned}
W_A &= \{u + A_1 A u + A_2 A u : u \in U\} \\
    &= \{(I + A_1 A)u + A_2 A u : u \in U\} \\
    &= \{u' + A_2 A (I + A_1 A)^{-1} u' : u' \in U\}
\end{aligned} \tag{6.12}$$

Since $A_2 A (I + A_1 A)^{-1} \in L(U,W')$, we have that

$$W_A = W'_{A_2 A (I + A_1 A)^{-1}} \tag{6.13}$$

since the derivation can be run backwards. It can be seen that if $S \in \Gamma(W) \cap \Gamma(W')$,
then the inverse actually exists. Since the coordinate change is a rational function
with nonzero denominator, it is $C^\infty$. The significance of its form will become clear
later.
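
The coordinate change just derived can be checked by hand in the smallest case. The
following is a minimal sketch, assuming $n = 2$ and $p = 1$ (lines through the origin in
the plane), so that every block is a $1 \times 1$ matrix. The particular choices
$U = \mathrm{span}\{(1,0)\}$, $W = \mathrm{span}\{(0,1)\}$, $W' = \mathrm{span}\{(1,1)\}$ and the function
names are illustrative assumptions, not part of the report.

```python
# Lines through the origin in R^2, i.e. G^1(R^2) with n = 2, p = 1.
# Assumed (illustrative) subspaces: U = span{(1,0)}, W = span{(0,1)}, W' = span{(1,1)}.

def chart_W(point):
    """Coordinate A of span{point} in the chart Gamma(W): point ~ (x, A x)."""
    x, y = point
    return y / x  # requires x != 0, i.e. the line meets W only at 0


def chart_W_prime(point):
    """Coordinate T of span{point} in Gamma(W'): point = u'(1,0) + T u'(1,1)."""
    x, y = point
    # (x, y) = (u', 0) + t(1, 1)  =>  t = y, u' = x - y, and T = t / u'.
    return y / (x - y)  # requires x != y, i.e. the line is not W' itself


# Decompose the basis vector w = (0,1) of W as u_1 + w'_1:
# (0,1) = -1*(1,0) + 1*(1,1), so A_1 = -1 and A_2 = 1 as 1 x 1 matrices.
A1, A2 = -1.0, 1.0


def transition(A):
    """Coordinate change A -> A_2 A (I + A_1 A)^{-1}, scalar case."""
    return A2 * A / (1.0 + A1 * A)


for slope in (0.25, -2.0, 3.0):
    line = (1.0, slope)
    # The two routes to the W'-coordinate agree.
    assert abs(transition(chart_W(line)) - chart_W_prime(line)) < 1e-12
```

The denominator $1 + A_1 A$ vanishes exactly for the line of slope 1, which is $W'$ itself
and therefore lies outside the chart $\Gamma(W')$, in agreement with the remark that the
inverse exists whenever $S$ belongs to both charts.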

The manifold $G^p(V)$.- The sets $\Gamma(W)$ have now been defined. It was said that
every p-dimensional subspace belongs to some $\Gamma(W)$, because some appropriate
complementary subspace $W$ can be found for it, and an appropriate map obtained. Hence
$G^p(V)$ is covered by the sets $\Gamma(W)$. It remains to show that $G^p(V)$ is Hausdorff and
compact.

It is easy to see that it is Hausdorff. This follows from the fact that given
two p-dimensional subspaces $S_1$ and $S_2$, there exists a $W$ such that $S_1$ and $S_2$ are
both in $\Gamma(W)$. Therefore, both are in the same $\Gamma(W)$, which is Hausdorff. Thus
$G^p(V)$ is indeed an $(n - p)p$-dimensional manifold.

The proof that $G^p(V)$ is compact relies on the theorem that the image of a
continuous map of a compact set is compact. Now, the unitary group, $U(n)$, is compact.
The strategy is to show that there is a continuous map, $T$, from $U(n)$ to $G^p(V)$ so that
$G^p(V)$ inherits a natural structure from $U(n)$. That $T$ is continuous will follow if the
inverse image of any open set $\mathcal{O} \subset \Gamma(W)$ of $G^p(V)$, $T^{-1}(\mathcal{O})$, is open in $U(n)$.

Define $T$ from $U(n)$ to $G^p(V)$ by:

$$T(\alpha) = \alpha(U) \tag{6.14}$$

where $U$ is the p-dimensional subspace of the definition of $G^p(V)$. Since
$\alpha \in U(n)$, $\alpha(U)$ is p-dimensional and so is in $G^p(V)$. It remains to show that $T$ is
continuous.

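Let $\mathcal{O}$ be an open subset of some chart $\Gamma(W)$. To prove continuity, it suffices
to show that $T^{-1}(\mathcal{O})$ is open. Since $\mathcal{O}$ is contained in $\Gamma(W)$, $\mathcal{O}$ is of the form
$\{W_A : A \in \mathcal{O}' \subset \mathbb{R}^{(n-p)p},\ \mathcal{O}' \text{ open}\}$. Now,

$$T^{-1}(W_A) = T^{-1}\{(x, Ax) : x \in U\}$$

so we look for the $\alpha$ such that $\alpha\{(x,0) : x \in U\} = \{(x,Ax) : x \in U\}$. Partitioning $\alpha$
as

$$\alpha = \begin{bmatrix} \alpha_1 & \alpha_2 \\ \alpha_3 & \alpha_4 \end{bmatrix} \tag{6.15}$$

we have

$$\alpha(U) = \{(\alpha_1 x, \alpha_3 x) : x \in U\} = \{(x, \alpha_3 \alpha_1^{-1} x) : x \in U\} \tag{6.16}$$

and

$$T^{-1}(\mathcal{O}) = \{\alpha \in U(n) : \alpha_3 \alpha_1^{-1} \in \mathcal{O}'\} \tag{6.17}$$

Continuity of matrix multiplication assures us that the set of all matrices

$$\{(\alpha_1, \alpha_3) : \alpha_3 \alpha_1^{-1} \in \mathcal{O}'\} \tag{6.18}$$

is open as a subset of the space of $n \times p$ matrices; hence, that the set of all matrices

$$\begin{bmatrix} \alpha_1 & \alpha_2 \\ \alpha_3 & \alpha_4 \end{bmatrix} \tag{6.19}$$

with $\alpha_3 \alpha_1^{-1} \in \mathcal{O}'$ is open as a subset of the space of $n \times n$ matrices. Thus, the
intersection of this open set with $U(n)$ is open and $T^{-1}(\mathcal{O})$ is an open set. So we have
that $T$ is a continuous map from $U(n)$ onto $G^p(V)$. Since $U(n)$ is compact and $T$ is
continuous, $G^p(V)$ is compact.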


We have now constructed a manifold from a rather complicated set. It figures 
prominently in several sets of developments in control theory, one of which concerns 
solutions of the Riccati equation. 


An Application of the Grassmannian 

As an application of the Grassmannian which will be of some importance to those 
reading the control theory literature, this subsection will discuss how the Riccati 
equation arises naturally in this context, and what some of its properties are. 

The Riccati equation .- It was seen earlier in this section that the Riccati equa- 
tion arose in connection with the transformation of Hamilton's equation. Here we will 
see that it arises quite generally as the generator of the transformation of the 
Grassmann manifold. First, the general transformation will be given. Then its 
dynamic behavior will be seen to be governed by the Riccati equation. 

To find the transformation, we examine the action of the general linear group,
$Gl(n)$, on $G^p(V)$. Let $\alpha \in Gl(n)$ and partition $\alpha$ as in (6.15). Note that an action
of $Gl(n)$ on $G^p(V)$ is well defined by:

$$(\alpha, W) \mapsto \alpha(W)$$

where $W$ is a p-dimensional subspace of $V$. The local representative of this
action is of interest.

$$\begin{aligned}
\alpha(W_A) &= \alpha(\{(x, Ax) : x \in U\}) \\
            &= \{(\alpha_1 x + \alpha_2 A x,\ \alpha_3 x + \alpha_4 A x) : x \in U\}
\end{aligned}$$

Now, if $\alpha(W_A) \in \Gamma(W)$, then there is a $B$ such that $\alpha(W_A) = \{(z, Bz) : z \in U\}$. In
particular, $(\alpha_1 + \alpha_2 A)x = z$ has a solution for all $z \in U$ and hence $(\alpha_1 + \alpha_2 A)^{-1}$
exists. Thus

$$\begin{aligned}
\alpha(W_A) &= \{(z, (\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1} z) : z \in U\} \\
            &= W_{(\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1}}
\end{aligned} \tag{6.21}$$

whenever $(\alpha_1 + \alpha_2 A)^{-1}$ exists, and the inverse exists iff $\alpha(W_A) \in \Gamma(W)$. Also note
that the inverse exists iff $\alpha(W_A)$ has zero intersection with $W$.

The function

$$\alpha : A \mapsto (\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1} \tag{6.22}$$

is called a generalized linear fractional transformation, and is the transformation in
parameter space that was sought.


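To examine its dynamic behavior, let $t \mapsto \alpha(t)$ be a one-parameter subgroup of
$Gl(n)$, and let

$$B = \dot\alpha(t)\,\alpha(t)^{-1} \tag{6.23}$$

be its infinitesimal generator. Let $W_A$ be some fixed element of $G^p(V)$. We have
that $t \mapsto \alpha(t)W_A$ is a curve in $G^p(V)$. Let us calculate the vector field with which
it is associated. Writing $\alpha$ as before, we have $\dot\alpha = B\alpha(t)$:

$$\begin{bmatrix} \dot\alpha_1 & \dot\alpha_2 \\ \dot\alpha_3 & \dot\alpha_4 \end{bmatrix}
= \begin{bmatrix}
B_{11}\alpha_1 + B_{12}\alpha_3 & B_{11}\alpha_2 + B_{12}\alpha_4 \\
B_{21}\alpha_1 + B_{22}\alpha_3 & B_{21}\alpha_2 + B_{22}\alpha_4
\end{bmatrix} \tag{6.24}$$

From (6.22) we see that we must calculate

$$\frac{d}{dt}\left[(\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1}\right] \tag{6.25}$$

Using the recipe:

$$\frac{d}{dt}\,XY^{-1} = \dot X Y^{-1} - XY^{-1}\dot Y Y^{-1}$$

gives for (6.25):

$$(\dot\alpha_3 + \dot\alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1}
- (\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1}(\dot\alpha_1 + \dot\alpha_2 A)(\alpha_1 + \alpha_2 A)^{-1}$$

Substituting for the $\dot\alpha_i$ from (6.24) and collecting terms gives:

$$\begin{aligned}
&[B_{21}\alpha_1 + B_{22}\alpha_3 + (B_{21}\alpha_2 + B_{22}\alpha_4)A](\alpha_1 + \alpha_2 A)^{-1} \\
&\quad - (\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1}[B_{11}\alpha_1 + B_{12}\alpha_3 + (B_{11}\alpha_2 + B_{12}\alpha_4)A](\alpha_1 + \alpha_2 A)^{-1} \\
&= B_{21} + B_{22}(\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1} \\
&\quad - (\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1}[B_{11} + B_{12}(\alpha_3 + \alpha_4 A)(\alpha_1 + \alpha_2 A)^{-1}] \\
&= B_{21} + B_{22}(\alpha W_A) - (\alpha W_A)B_{11} - (\alpha W_A)B_{12}(\alpha W_A)
\end{aligned} \tag{6.26}$$

Thus, writing $P(t)$ for the local coordinate of $\alpha(t)W_A$, we show that the curve
$\alpha(t)W_A$ satisfies the differential equation:

$$\dot P = B_{21} + B_{22}P - PB_{11} - PB_{12}P, \qquad P(t = 0) = A \tag{6.27}$$

Thus, Riccati differential equations are associated with a natural group action on
the manifold $G^p(V)$.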
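
The derivation can be checked numerically in the scalar case. The sketch below assumes
$n = 2$ and $p = 1$, so that $B_{11}$, $B_{12}$, $B_{21}$, $B_{22}$, and $A$ are numbers; the
values of `B` and `A`, the helper names, and the truncated power series used for the
matrix exponential are illustrative assumptions. It moves $W_A$ by $\alpha(t) = \exp(Bt)$, reads
off the coordinate $P(t)$ from the linear fractional transformation, and compares a
finite-difference derivative of $P$ with the quadratic right-hand side.

```python
def mat_mul(X, Y):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expm(M, terms=30):
    """Matrix exponential of a 2 x 2 matrix by a truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term becomes M^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


# Illustrative generator B and initial coordinate A (arbitrary values).
B = [[0.3, -0.5], [0.8, 0.1]]
A = 0.4


def P(t):
    """Chart coordinate of alpha(t) W_A, with alpha(t) = exp(B t)."""
    (a1, a2), (a3, a4) = expm([[b * t for b in row] for row in B])
    return (a3 + a4 * A) / (a1 + a2 * A)


def riccati_rhs(p):
    """B21 + B22 P - P B11 - P B12 P for 1 x 1 blocks."""
    return B[1][0] + B[1][1] * p - p * B[0][0] - p * B[0][1] * p


t, h = 0.5, 1e-5
dP_dt = (P(t + h) - P(t - h)) / (2.0 * h)  # central difference
print(abs(dP_dt - riccati_rhs(P(t))))      # small: finite-difference error only
```

The agreement holds only while $\alpha_1 + \alpha_2 A$ stays away from zero, that is, while the
moving subspace remains in the chart.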


Some properties of Riccati equations.- There are many properties of the Riccati
equations that can be deduced just from knowledge of the manifold and the fact that
they are associated with a group action. For example, the Grassmannian manifold
$G^p(V)$ shares with the sphere the property that every vector field vanishes at at least
one point; i.e., there is a point $W \in G^p(V)$ such that $\alpha(t)W = W$ for all $t$.

Now if $W \in \Gamma(W')$ then with respect to the local coordinates defined by $\Gamma(W')$
we have that

$$0 = \dot P(t) = B_{21} + B_{22}P - PB_{11} - PB_{12}P \tag{6.28}$$

and hence, that there is a solution of the algebraic Riccati equation. However, given
the algebraic Riccati equation (6.28) there may not be a matrix $P$ that satisfies
(6.28). The guaranteed solution may not be in the required chart, i.e., the solution
exists at $\infty$.

In the rest of this section we will study some of the elementary consequences of
the fact that Riccati differential equations are associated with a group action on
$G^p(V)$. As a first problem we ask:

Problem 1. What are necessary and sufficient conditions for the algebraic 
Riccati equation to have a solution? 

By the algebraic Riccati equation we mean the equation

$$B_{21} + B_{22}P - PB_{11} - PB_{12}P = 0 \tag{6.29}$$

An equivalent question is: what are the constant solutions of (6.27)?

The general theory tells us that every vector field on $G^p(V)$ vanishes at
at least one point, so we know that in the large there are solutions although they may
be "solutions at infinity." So instead of searching for solutions of (6.29) we can
look for p-dimensional subspaces, $W$, of $V$ such that

$$\alpha(t)W = W \tag{6.30}$$

for all $t \in \mathbb{R}$. Unfortunately, all we know about $\alpha(t)$ is that it has infinitesimal
generator $B$. Thus, we need the following lemma.

Lemma: Let $\dot\alpha(t) = B\alpha(t)$ and let $W \in G^p(V)$. Then $\alpha(t)W = W$ for all
$t \in \mathbb{R}$ iff $BW \subset W$.

Proof: Suppose $\alpha(t)W = W$. Let $w_i$, $i = 1, \ldots, p$, be a basis for $W$; then
$\alpha(t)w_i = \sum_j a_{ij}(t)w_j$ and we have as a consequence that $\dot\alpha(t)w_i = \sum_j \dot a_{ij}(t)w_j$, so
that $\dot\alpha(t)W \subset W$. Thus,

$$B\alpha(t)W = BW$$

and

$$B\alpha(t)W = \dot\alpha(t)W \subset W$$

so that $BW \subset W$. On the other hand, assume that $BW \subset W$. Now $\alpha(t) = \exp(Bt)$. Using
the definition of $\exp(Bt)$ and the fact that $B^k W \subset W$ for all $k$, it follows that
$\alpha(t)W \subset W$; since $\alpha(t)$ is invertible, the dimensions agree and $\alpha(t)W = W$.

The lemma reduces the problem of finding invariant subspaces of $\alpha(t)$ to finding
invariant subspaces of $B$, technically at least, an easier problem. We can now
restrict ourselves to the study of the invariant subspaces of $B$. Our problem has
been reduced to asking if there is a p-dimensional invariant subspace of $B$ that has
zero intersection with $U_0$ (recall pg. 48). We answer this question first in the
generic case.

Assume that $B$ has $n$ linearly independent eigenvectors $\eta_1, \ldots, \eta_n$. We
claim that in this case there is always such a solution. Assume $\eta_1 \notin U_0$. There is
such an $\eta_1$, for if not, the $n$ eigenvectors would all lie in $U_0$ and so span a proper
subspace of $V$, which contradicts their independence. Assume $V_k$ has been constructed
such that $V_k = \langle \eta_1, \ldots, \eta_k \rangle$ and $V_k \cap U_0 = \{0\}$. If $k = p$ we are finished; if not,
then $k < p$ ($k$ cannot be greater than $p$ when $V_k \cap U_0 = \{0\}$). Let $W_r = \langle V_k, \eta_r \rangle$ for
$r > k$ and suppose $W_r \cap U_0 \ne \{0\}$ for all $r > k$. Since $V_k \cap U_0 = \{0\}$, a nonzero element
of $W_r \cap U_0$ has a nonzero coefficient on $\eta_r$, and after scaling we have that

$$\eta_j + \sum_{i=1}^{k} \alpha_{ij}\eta_i = u_j \in U_0, \qquad j = k+1, \ldots, n \tag{6.31}$$

There are more than $n - p$ equations and thus there are $a_j$, not all zero, such that

$$\sum_{j=k+1}^{n} a_j \Big( \eta_j + \sum_{i=1}^{k} \alpha_{ij}\eta_i \Big) = \sum_{j=k+1}^{n} a_j u_j = 0$$

Therefore, $a_j = 0$ for $j = k + 1, \ldots, n$ because of the independence of the $\eta_j$, but
this contradicts the choice of the $a_j$ as not all zero, and so for some $r$

$$W_r \cap U_0 = \{0\}$$

Renumbering if necessary, let $V_{k+1} = W_r$. We eventually have a $V_p$ such that
$V_p \oplus U_0 = V$; hence, there is a $P$ such that $V_p = W_P$ and $P$ is the desired
solution of (6.29). Potter (ref. 11) essentially recognized this fact.
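
In the smallest case the correspondence between invariant subspaces of $B$ and solutions
of the algebraic Riccati equation can be written out directly. The following is a
minimal sketch, assuming $n = 2$, $p = 1$, real eigenvalues, and $B_{12} \ne 0$; the function
names and the sample matrix are illustrative assumptions. An eigenvector of the form
$(1, P)$ spans an invariant line that is the graph of $P$, and eliminating the eigenvalue
leaves exactly the quadratic $B_{21} + B_{22}P - PB_{11} - PB_{12}P = 0$.

```python
import math


def invariant_line_slopes(B):
    """Slopes P of the B-invariant lines {(x, P x)} for a real 2 x 2 matrix B."""
    (b11, b12), (b21, b22) = B
    # B (1, P)^T = lam (1, P)^T  <=>  lam = b11 + b12 P  and  b21 + b22 P = lam P.
    # Eliminating lam gives the quadratic b12 P^2 + (b11 - b22) P - b21 = 0.
    disc = (b11 - b22) ** 2 + 4.0 * b12 * b21
    if b12 == 0 or disc < 0:
        raise ValueError("this sketch handles only b12 != 0 with real eigenvalues")
    root = math.sqrt(disc)
    return [(-(b11 - b22) + s * root) / (2.0 * b12) for s in (1.0, -1.0)]


def riccati_residual(B, P):
    """Left side of the algebraic Riccati equation for 1 x 1 blocks."""
    (b11, b12), (b21, b22) = B
    return b21 + b22 * P - P * b11 - P * b12 * P


B = [[1.0, 2.0], [3.0, 2.0]]  # illustrative matrix with eigenvalues 4 and -1
for P in invariant_line_slopes(B):
    print(P, riccati_residual(B, P))
```

When $B_{12} = 0$ the quadratic term drops out and one of the two invariant lines is the
vertical one, which has no finite slope; this is the scalar picture of a solution
"at infinity."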

A second problem can be considered. We have long associated various groups with
differential equations and have asked what group leaves some properties or form of the
solution of a differential equation invariant. Of course, in that imprecise form
there is no precise answer, but maybe an example will clarify the issue somewhat.
Consider an ordinary linear differential equation

$$\dot x = Ax \tag{6.32}$$

We are all used to the concept of changing basis in state space to transform the
equation to

$$\dot z = (\alpha A \alpha^{-1})z \tag{6.33}$$

Such a transformation has the useful property that it leaves the linearity invariant.
Let us now examine the following problem.

Problem 2. Is there a large group of transformations that leaves the quadratic 
nature of the Riccati differential equation invariant? 

Now this problem is perhaps a little harder than it would first appear. Consider
again equation (6.27) and make the transformation

$$P = \alpha S$$

where $\alpha \in Gl(n - p)$. Then $S$ satisfies

$$\dot S = \alpha^{-1}B_{21} + (\alpha^{-1}B_{22}\alpha)S - S(B_{11}) - S(B_{12}\alpha)S \tag{6.34}$$

and so the Riccati nature is preserved. However, it is well known in the
control-theory literature that if

$$S = P^{-1}$$

then $S$ again satisfies a Riccati equation; for

$$PS = I$$

implies

$$\dot P S + P \dot S = 0$$

which yields

$$P^{-1}\dot P S + \dot S = 0$$

and

$$\dot S = B_{12} + B_{11}S - SB_{22} - SB_{21}S \tag{6.35}$$

Thus, the group we are seeking is larger than just a linear change of basis. In fact 
we will demonstrate the following: 

Theorem: The class of Riccati equations is invariant under transformation
by generalized linear fractional transformations; i.e., if

$$S = (\alpha_3 + \alpha_4 P)(\alpha_1 + \alpha_2 P)^{-1}$$

then, if $P$ satisfies a Riccati equation, so does $S$.

To prove this we will work in the global situation rather than in the local
coordinates. Let $P(t)$ be the solution of (6.27) and let $A(t)$ be the one-parameter
subgroup such that

$$W_{P(t)} = A(t)W_A \tag{6.36}$$

Let $B$ be the infinitesimal generator of $A(t)$. From (6.21) we know that linear
fractional transformations of $P$ are associated with the action of $Gl(n)$ on $G^p(V)$
and so we are asking what (if any) differential equation does the curve

$$t \mapsto \alpha W_{P(t)} \tag{6.37}$$

satisfy?

Now obviously

$$\alpha W_{P(t)} = \alpha A(t)W_A$$

but $\alpha A(t)$ is not a one-parameter subgroup (it is not a subgroup). However, we do
have

$$\alpha W_{P(t)} = \alpha A(t)\alpha^{-1}\,\alpha W_A$$

and $\alpha A(t)\alpha^{-1}$ is a subgroup. Thus if

$$\alpha = \begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix} \tag{6.38}$$

and

$$S(t) = (\alpha_{21} + \alpha_{22}P(t))(\alpha_{11} + \alpha_{12}P(t))^{-1}$$

then

$$W_{S(t)} = \alpha A(t)\alpha^{-1}(\alpha W_A)$$

and so $S(t)$ satisfies a Riccati differential equation. The infinitesimal generator
of

$$\alpha A(t)\alpha^{-1}$$

is $\alpha B\alpha^{-1}$ and so the coefficients of the Riccati equation of $S$ are related to the
coefficients of the Riccati equation of $P$ by

$$B \mapsto \alpha B\alpha^{-1}$$

This is not easy to show directly.
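
Although a direct symbolic proof is awkward, the statement is easy to test numerically.
The sketch below assumes the scalar case $n = 2$, $p = 1$; the matrices `B` and `alpha`,
the initial value `A`, and all helper names are illustrative assumptions. It transforms
a solution $P(t)$ by the linear fractional transformation built from `alpha` and checks by
finite differences that the result obeys the Riccati equation whose coefficients are the
blocks of $\alpha B\alpha^{-1}$.

```python
def mat_mul(X, Y):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expm(M, terms=30):
    """Matrix exponential of a 2 x 2 matrix by a truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def inverse(M):
    """Inverse of a 2 x 2 matrix with nonzero determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def lft(g, p):
    """Generalized linear fractional transformation for 1 x 1 blocks."""
    (g11, g12), (g21, g22) = g
    return (g21 + g22 * p) / (g11 + g12 * p)


def riccati_rhs(G, s):
    """G21 + G22 s - s G11 - s G12 s for 1 x 1 blocks."""
    return G[1][0] + G[1][1] * s - s * G[0][0] - s * G[0][1] * s


B = [[0.3, -0.5], [0.8, 0.1]]       # generator for P (illustrative)
alpha = [[1.0, 0.5], [-0.3, 2.0]]   # fixed transformation (illustrative)
A = 0.4                             # initial coordinate P(0)
C = mat_mul(mat_mul(alpha, B), inverse(alpha))  # alpha B alpha^{-1}


def P(t):
    return lft(expm([[b * t for b in row] for row in B]), A)


def S(t):
    return lft(alpha, P(t))


t, h = 0.5, 1e-5
dS_dt = (S(t + h) - S(t - h)) / (2.0 * h)  # central difference
print(abs(dS_dt - riccati_rhs(C, S(t))))   # small: finite-difference error only
```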
Following this line of thought, one can mention two special cases. In the first
case, suppose $B$ has $n$ linearly independent eigenvectors. Then, there is an $\alpha$ such
that $\alpha B\alpha^{-1}$ is diagonal and the associated Riccati equation is

$$\dot S = D_2 S - S D_1$$

where

$$\alpha B\alpha^{-1} = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}$$

Thus, the study of the Riccati equation associated with $B$ is equivalent to the study
of simpler linear differential equations.

For the second case, consider the fact that every $B$ is equivalent to a matrix
of the form

$$\begin{bmatrix} B_1 & 0 \\ E & B_2 \end{bmatrix}$$

where $E$ is the $(n - p) \times p$ matrix which is zero except possibly for the element at
the $1,p$ position of $E$. Thus, every Riccati equation can be transformed into a
linear equation:

$$\dot S = E + B_2 S - S B_1$$

This fact is rather elementary but doesn't appear to have been noted in the literature.

There are many results that can be proven about Riccati equations using the 
Grassmannian techniques. However, they belong more properly to a research monograph 
than to an introduction to differential geometry. 


CONCLUDING REMARKS 


We have attempted in this report to give an informal introduction to differential 
geometry that would be palatable for a nonmathematically trained engineer. It is 
primarily intended for the control engineers, but we hope that persons in other dis- 
ciplines will be interested in learning these basic facts. 

Anyone who has read these notes by now realizes that a lot has been left unsaid. 
If one tries to apply differential geometric techniques to real problems he will 
quickly see that this material must be augmented by more powerful results. The pur- 
pose of these notes was to prepare the reader to delve into the more specific areas of 
differential geometry. 

There are many excellent books available. The authors have found the following 
to be particularly useful. "Calculus on Manifolds" by Spivak (ref. 4) should be read 
by anyone interested in differential geometry. It is short and concise, and it sets 
the stage for serious study. 
"An Introduction to Differentiable Manifolds and Riemannian Geometry" by W. Boothby
(ref. 2) is readable and is as first class a source book as is "Differential Geometry 
and the Calculus of Variations" by R. Hermann (ref. 1). There are many other intro- 
ductory books on differential geometry available — some are quite good and others are 
only mediocre. The problem with most is that they were written for mathematicians 
assumed to have good mathematical background as well as mathematical sophistication. 

At the more specialized level there are the monographs on mechanics by Abraham
and Marsden (ref. 12) and the monograph by Arnold (ref. 13). Both are advanced and
sophisticated but are very well written. Both books deserve to be in everyone's
reference library.

For control theory one should be familiar with the various books of R. Hermann.
In addition, one must consult the current literature. The SIAM Journal of Control and
the IEEE Transactions on Automatic Control both contain occasional papers on geometric
control theory. Many of the papers at a Harvard workshop sponsored by NASA-Ames,
NATO, and the AMS (refs. 14 and 15) have the concepts of manifolds and of differential
geometry at their core.

One should remember that control theory problems don't arise for the benefit of
differential geometers, and every available method should be used to obtain a
satisfactory solution. Differential geometry is just a tool, and the control engineer
should not restrict himself to one tool but should be familiar with as many as possible.


Ames Research Center 

National Aeronautics and Space Administration 

Moffett Field, Calif. 94035, February 1982 
APPENDIX


FUNDAMENTALS OF VECTOR CALCULUS 


This appendix recalls a number of concepts usually covered in an advanced calcu- 
lus course. Though it might serve to prepare the reader psychologically, its true 
purpose is simply to express explicitly the ground rules for the contents of the body 
of the paper. 


REAL EUCLIDEAN SPACE 


The real linear vector n-space, $\mathbb{R}^n$, is the set of all n-tuples of real numbers.
That is,

$$\mathbb{R}^n = \{(x_1, \ldots, x_n) : x_i \in \mathbb{R},\ i = 1, \ldots, n\}$$

These n-tuples, when added together such that $x + y = (x_1 + y_1, \ldots, x_n + y_n)$ or
multiplied by real numbers so that $ax = (ax_1, \ldots, ax_n)$ for $a \in \mathbb{R}$, give results that
also belong to the same space $\mathbb{R}^n$. A complex vector space is defined accordingly,
and is denoted by $\mathbb{C}^n$.

The vector space becomes a real Euclidean space (also often denoted by $\mathbb{R}^n$) when
it has a particular measure of size, a norm. This norm is just a generalization of
the concept of length in ordinary 3-space, $\mathbb{R}^3$. For $x \in \mathbb{R}^n$, we define the norm of $x$
as the real number:

$$|x| = (x_1^2 + \cdots + x_n^2)^{1/2}$$

Intuition holds, and $|(\ )|$ has all the properties of length that one expects.

The inner product of two vectors in $\mathbb{R}^n$, $x$ and $y$, can be defined as:

$$x \cdot y = x_1 y_1 + \cdots + x_n y_n$$

The norm and inner product functions are related by:

$$x \cdot x = |x|^2$$

We will say that $x$ is orthogonal to $y$ if

$$x \cdot y = 0$$


TOPOLOGICAL SPACES 


Even though the concepts of norm and inner product are essentially vector space 
concepts , they give rise to a convenient mode of expression of topological properties 
that are necessary to the development of calculus. 
An open ball of radius $r$ at $x_0$, $S_r(x_0)$, is the set of all those points lying
within a distance $r$ of the point $x_0$:

$$S_r(x_0) = \{x : |x - x_0| < r\}$$

A subset $U$ of $\mathbb{R}^n$, $U \subset \mathbb{R}^n$, is said to be open if, for each $x$ in $U$ there is an
$r > 0$ such that $S_r(x) \subset U$; that is, $U$ contains an open ball about each of its points.
A subset $C$ of $\mathbb{R}^n$ is said to be closed if its complement, that is, the set of all
$x$ that do not belong to $C$, $\{x : x \notin C\}$, is an open subset. The collection of all
such open subsets of $\mathbb{R}^n$ constitutes one possible topology for $\mathbb{R}^n$. We will refer
to it as the usual topology, the Euclidean topology, or as the metric topology,
depending on how precise we feel like being at the moment. The discussion will
usually concern $\mathbb{R}^n$, perhaps with the Euclidean notion of distance. There will be
occasion later in this section to refer to objects that are topological spaces, but
which are not necessarily metric subspaces of $\mathbb{R}^n$.

A typical exercise throughout this paper will be to refer whatever object that is 
being studied back to the real numbers or to spaces constructed of strings of real 
numbers. The reason is simply that we know the properties of real numbers. If we 
can relate the object under study to the real numbers by a continuous function with a 
continuous inverse, then we can transfer fundamental properties of the real numbers 
to the object of interest. 

What the fundamental properties are that are needed for calculations and how 
objects differ by having different ones of these properties is the concern of point 
set topology. Our interest in topology is simply to say what some requisites are for 
the operations of differential geometry. Since the treatment is introductory, we will 
not discuss the various classes of geometric objects that one can come upon. 

Our use of topology, then, is like the scientist's or engineer’s interest in 
fundamental physical standards like the standard meter or the standard kilogram. To 
make a comparison of length, the scientist lays the standard along his rod which he 
marks according to the standard's marks. The mathematical equivalent of this laying 
alongside is finding a continuous function that relates an open set of the standard 
real numbers and an open set of the other mathematical object. If one matching
suffices for the whole object, fine. Usually the comparison is done in pieces to
prevent uncertainty and equivocal results, using as many open sets as are needed. The
type of number of open sets required is important mathematically and helps classify 
the geometric object. We will need a finite or, at worst, a countable number of them. 

A topological space, in general, is a pair $(X,\Omega)$ where $X$ is a set and $\Omega$ is a
collection of subsets of $X$ with the property that the empty set, $\emptyset$, and $X$ itself are
in $\Omega$, the intersection of any two elements of $\Omega$ is in $\Omega$, and the union of an arbitrary
number of members of $\Omega$ is also in $\Omega$. The elements of $\Omega$ are called the open sets for the
topological space $(X,\Omega)$. For example, $\mathbb{R}^n$ with an $\Omega$ consisting of sets that are
open in the sense defined in the previous paragraph, $(\mathbb{R}^n,\Omega)$, is a topological space
(with the usual topology).

There are two extreme examples of topological spaces that can be constructed out
of any set $X$; namely, when $\Omega = \{\emptyset, X\}$ and when $\Omega = \{\text{all subsets of } X\}$. In the
first, the "indiscrete" case, $\Omega$ has too few sets to be very useful; in the second, it
has too many. Requiring that the following two conditions hold for the space
eliminates those two extremes from further consideration.
The first condition to be imposed is that the space $(X,\Omega)$ be Hausdorff: given
any two distinct points $x$ and $y$ in $X$, there are two open sets $U$ and $V$ such
that $x \in U$, $y \in V$ but their intersection is empty, $U \cap V = \emptyset$. This condition is
essential for analysis because it assures us that a convergent sequence has a unique
limit. Since it also guarantees that $\Omega$ has a lot of open sets, it rules out the
indiscrete topology from selection.

The second condition to be imposed is that the space $(X,\Omega)$ be second countable;
that is, that there be a countable subset of $\Omega$, $\mathcal{S} = \{S_1, S_2, \ldots\}$, such that every
element of $\Omega$ is a union of finite intersections of elements of $\mathcal{S}$.
Then $X$ is said to have a countable basis of open sets. Thus, this condition assures
us that there aren't too many open sets. An example of a second countable topological
space is $\mathbb{R}^n$ with the metric topology. A suitable set $\mathcal{S}$ for this $\Omega$ is the set of
all open balls $S_r(x)$ where $r$ and the coordinates of $x$ are rational numbers.
The set $\mathcal{S}$ is countable since each ball in it is specified by finitely many rational
numbers, and the rational numbers are countable.


COMPACTNESS 


So far, our discussion of fundamentals has gone from a particular topological 
space, namely, real Euclidean space, to the more general topological space (which will 
be seen later to be required for control applications) — a "second-countable 
Hausdorff" space. Requiring that kind of countability and separability is as general 
as we get. This next topic of compactness refers to an additional property that is 
very useful for the spaces to have. Research papers often discuss a concept up to a 
certain point and then invoke compactness to enable satisfyingly tidy conclusions to 
be reached (or untidy aberrations to be avoided). Since as a matter of fact it is
usually the tidy conclusion that raises the interest of the design engineer, it
pays off in practice to look quite hard at the space one happens to be working with to 
see if it is compact. 

In a Euclidean space, compactness of a set is tantamount to the set being closed and
bounded. A closed set contains all its limits so that sequences can end in the
set. Boundedness means that every point is within reach; paths don't end at rainbows
that constantly recede. The useful property abstracted from closedness and
boundedness is that together they imply that not only can one count the number of open
sets one needs to cover the space of concern, but also that after a while one finishes
the counting. This is the property of compactness: from any collection of open sets
that covers the space, a finite number can be selected that still cover every point of
the whole space, while it remains true that any two points can belong to different
open sets.

A comparison of two simple two-dimensional spaces may clarify the idea. The 
plane is not compact; the sphere is. The plane is "compactified" by finding functions 
that map it onto the sphere. 
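
One standard choice of such a function is stereographic projection; the report does not
name the map it has in mind, so its use here is an assumption. The following minimal
sketch (the function name and sample points are illustrative) sends a point of the plane
to the unit sphere. Every image lies on the sphere, the north pole $(0, 0, 1)$ is never
reached, and points far from the origin land close to it; adjoining that single point
"at infinity" is what compactifies the plane.

```python
import math


def plane_to_sphere(x, y):
    """Inverse stereographic projection from the north pole (0, 0, 1)."""
    s = x * x + y * y
    return (2.0 * x / (s + 1.0), 2.0 * y / (s + 1.0), (s - 1.0) / (s + 1.0))


for point in [(0.0, 0.0), (1.0, 0.0), (3.0, -4.0), (1.0e6, 0.0)]:
    X, Y, Z = plane_to_sphere(*point)
    radius = math.sqrt(X * X + Y * Y + Z * Z)  # should be 1 up to rounding
    print(point, (X, Y, Z), radius)
```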


CONTINUITY 


It is assumed that the reader is familiar with the concept of functions defined
on $\mathbb{R}^n$, and with continuity of functions as defined in terms of limits. Continuity
can also be defined in a way that makes sense in arbitrary topological spaces. (From
now on, we will refer to the topological spaces $(X,\Omega_X)$ and $(Y,\Omega_Y)$ by just $X$ and $Y$.)
Consider two topological spaces $X$ and $Y$. Let $f$ be a function with domain $X$ and
range $Y$. Let $U$ be an open set in $Y$. The inverse image of $U$ is the set of
elements in $X$ that map into $U$: $f^{-1}(U) = \{x \in X : f(x) \in U\}$. Then we say that $f$
is a continuous function if for each open set $U$ in $Y$, $f^{-1}(U)$ is open in $X$. If $X$
and $Y$ are Euclidean spaces, then this definition in terms of open sets is equivalent
to the one utilizing limits and open intervals or balls.

Continuity is a key property that will be required throughout the following pages
every time two spaces are related to each other. It preserves a topological
consistency in relating two spaces that is expressed by the word "homeomorphism." Spaces
are homeomorphic if they are related by a continuous function that has a continuous
function as its inverse. Then the topological consistency arises from such facts as
that the images of open sets are open and the images of compact sets are compact.

That these structural properties carry over is a basic requirement in differential
geometry. In fact, it is generally required that various orders of derivatives of the
relating functions be continuous. Then the functions are called diffeomorphisms.


DERIVATIVE 


The reader will recall that the definition of the derivative of a function $f$
with domain an open subset $E$ of $\mathbb{R}^n$ and with range in $\mathbb{R}^m$ is slightly more
complicated than in the case of a function of a single variable. We say that $f$ is
differentiable at $x \in E$ if there is a linear map $A(x)$, a Jacobian matrix, from
$\mathbb{R}^n$ to $\mathbb{R}^m$ such that

$$\lim_{h \to 0} \frac{|f(x + h) - f(x) - A(x)h|}{|h|} = 0$$

Then $A(x)$ is called the derivative of $f$, and we will let the Jacobian matrix be
called $f'(x)$; $A(x) = f'(x)$. At any particular place, $x$, $A(x)$ is a particular linear
map from $\mathbb{R}^n$ to $\mathbb{R}^m$. Hence $A(\ )$ is a function whose domain is $E$ and whose range
is the space of linear maps from $\mathbb{R}^n$ to $\mathbb{R}^m$, denoted by $L(\mathbb{R}^n,\mathbb{R}^m)$. Now $L(\mathbb{R}^n,\mathbb{R}^m)$
can be identified with the $m \times n$ matrices and hence with $\mathbb{R}^{mn}$. Thus $A(\ )$ is a map
from $E \subset \mathbb{R}^n$ to $\mathbb{R}^{mn}$ whose continuity and differentiability can be discussed.

We say that $f$ is a $C^1(E)$ function if $A$, its derivative, is continuous, and
we say $f$ is a $C^k(E)$ function if $A$ is a $C^{k-1}(E)$ function. If $f$ is a $C^k(E)$
function for all $k$, we say that $f$ is a $C^\infty(E)$ function. Note that even though this
says that $f$ has derivatives of all orders, this is a strictly weaker condition than
saying that $f$ is represented by a power series. The function

$$f(x) = \begin{cases} \exp(-1/x^2), & x > 0 \\ 0, & x \le 0 \end{cases}$$

is the classical example of a $C^\infty$ function that is not represented by a power series
at $x = 0$. A function that is represented by a convergent power series is called
analytic. In this report, all functions, except those that are solutions of
differential equations, will be assumed to be $C^\infty$. If both $f$ and its inverse, $f^{-1}$, are
$C^\infty$ functions, then $f$ is said to be a diffeomorphism.
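
The failure of the power series can be seen numerically. Every derivative of the
classical example vanishes at $x = 0$, so its Taylor series there is identically zero,
yet the function is positive for $x > 0$. The following short sketch (the sample points
and exponents are arbitrary choices) shows the underlying reason: $f(x)/x^n$ still
tends to zero as $x \to 0^+$, whatever the power $n$.

```python
import math


def f(x):
    """The classical smooth but non-analytic function."""
    return math.exp(-1.0 / (x * x)) if x > 0 else 0.0


for n in (1, 5, 10):
    # Each row shrinks rapidly as x decreases: f vanishes faster than x**n.
    print(n, [f(x) / x ** n for x in (0.5, 0.2, 0.1)])
```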
INVERSE AND IMPLICIT FUNCTION THEOREMS


The section on manifolds, the basic notion of a space in differential geometry,
shows that a manifold is very closely associated with functions that are implicitly
defined. The following discussion may refresh the reader's memory of such implicit
functions.

Suppose one has a function $f$ which maps $\mathbb{R}^n \times \mathbb{R}^m$ into $\mathbb{R}^m$; that is,

$$f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m; \qquad
f = f(x_1, \ldots, x_n, y_1, \ldots, y_m) = (f_1, \ldots, f_m) = (0, \ldots, 0)$$

The implicit function theorem gives rather general conditions under which $f$ can be
reformulated so that $m$ of the variables, $y$, can be expressed explicitly as a
function of the other $n$ variables $x$: $y = z(x)$.

A simple example shows that the m-valued function $f$ can be expanded to an
(n + m)-valued function which is invertible. Let $f$ be given by

$$a_1 x_1 + a_2 x_2 + a_3 x_3 - b = 0$$

Two identities, $x_2 - x_2 = 0$ and $x_3 - x_3 = 0$, can be added trivially to form the set:

$$\begin{bmatrix} a_1 & a_2 & a_3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} b \\ x_2 \\ x_3 \end{bmatrix}$$

Then if the determinant of the coefficient matrix is not zero, that is, if $a_1 \ne 0$,
then the set of equations can be solved for $x_1$ in terms of $x_2$ and $x_3$.

The example shows how finding the explicit function $x_1$ from its implicit
representation $a_1 x_1 + a_2 x_2 + a_3 x_3 - b = 0$ involves finding an inverse. Hence, not
only is it useful to state the Implicit Function theorem, but it also is desirable to
state the Inverse Function theorem as well, to serve both as a lemma and as a special
case. The two theorems will only be quoted here. The reader can consult any advanced
calculus book for a discussion and interpretation of them.
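
The linear example above is small enough to carry out explicitly. The following is a
minimal sketch with illustrative coefficient values and a hypothetical function name:
when $a_1 \ne 0$ the coefficient matrix is invertible, and solving the first row for
$x_1$ gives the explicit function.

```python
def solve_x1(a, b, x2, x3):
    """Solve a1*x1 + a2*x2 + a3*x3 - b = 0 for x1, given x2 and x3."""
    a1, a2, a3 = a
    if a1 == 0:
        # The determinant of the coefficient matrix is a1, so no inverse exists.
        raise ValueError("a1 = 0: the coefficient matrix is singular")
    return (b - a2 * x2 - a3 * x3) / a1


a, b = (2.0, -1.0, 4.0), 3.0   # illustrative coefficients
x2, x3 = 1.0, 0.5
x1 = solve_x1(a, b, x2, x3)
# The second value printed is the residual of the implicit equation.
print(x1, a[0] * x1 + a[1] * x2 + a[2] * x3 - b)
```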


Inverse and Implicit Function Theorems 

Inverse function theorem.- (Reference 4, page 35.) Suppose that $f : \mathbb{R}^n \to \mathbb{R}^n$ is
a continuously differentiable function on an open set containing the point $a$ and
suppose that the Jacobian, $\det f'(a)$, is not zero. Then there is an open set $V$
containing $a$ and an open set $W$ containing $f(a)$ such that $f : V \to W$ has a continuous
inverse $f^{-1} : W \to V$ which is differentiable and which for all $y \in W$ satisfies

$$(f^{-1})'(y) = [f'(f^{-1}(y))]^{-1}$$

In the next theorem, the expression $D_j f^i(x)$ refers to the derivative along the
jth coordinate of the ith component of the vector-valued function $f$ with the
value taken at the point $x$.

Implicit function theorem.- (Reference 4, page 41.) Suppose $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$
is continuously differentiable in an open set containing $(a,b)$, and that $f(a,b) = 0$.
Let $M$ be the $m \times m$ matrix $(D_{n+j} f^i(a,b))$, $1 \le i, j \le m$. If $\det M \ne 0$, then there
is an open set $A \subset \mathbb{R}^n$ containing $a$ and an open set $B \subset \mathbb{R}^m$ containing $b$ with
the following property: for each $x \in A$ there is a unique $g(x) \in B$ such that
$f(x,g(x)) = 0$. The function $g$ is differentiable.

The background in advanced calculus which this appendix has recalled is a suffi- 
cient collection of information for this report. There are many other small details 
that are useful to recall, and they are pointed out as the need for them arises. 